PIC VTI/OSD

PIC based Video Time Inserter (VTI) and On Screen Display (OSD)

This is essentially an exercise to explore the limits of the PIC device as a VTI (Video Time Inserter) or On Screen Display (OSD). For VTI operation, the biggest problem turns out to be extracting the Frame and Line Sync from the incoming AV signal. The biggest limitation of the 16Fx (20MHz) devices in VTI/OSD use is the low pixel resolution - only some 25 characters across the screen (and a max of 40 lines) is achievable, which is only just enough for a 'chunky' VTI adding date and time to a CCTV video stream (as a 'banner' across the bottom).

The VTI requirement

A common requirement for any CCTV system is to insert the date and time in a 'guaranteed' way that avoids any possibility of error. Essentially, this means adding the date/time to the 'live' AV video feed (i.e. before it is recorded) using a GPS or NTP source derived date and time. This means building a type of On Screen Display (OSD) device known as a Video Time Inserter (VTI)

The only 'guaranteed' date/time source for a totally 'stand-alone' VTI is the GPS system, however it may be 'acceptable' to run the device as a RTC (Real Time Clock) from a manually entered (or serially transmitted, NTP derived) 'start' time/date. Since most CCTV systems will always be 'powered on', it may even be possible to dispense with the normal RTC battery back-up - which means the RTC 'counter/timer' can be combined with the VTI function into a single PIC chip.

Most CCTV cameras are of very poor resolution. The 'CIF' standard generates non-interlaced 'frames' of 240 scan lines with an effective pixel resolution of about 352 pixels per line - these are designed to be used with a 'TV standard' VHS analogue tape recording system that combines 4 camera inputs (4 x 240x352 frames fit nicely into one 480x704 frame, which is NTSC TV resolution)

Anyone who has seen CIF CCTV playback knows how poor the quality is and how it is almost impossible to positively identify anyone from the recording - unless the subject happens to look straight at the camera from a few feet away. CIF should only be used as a 'spy hole' camera on an entry door

The 'High Resolution' CCTV standard 'full D1' camera is designed for direct recording to the 704 x 480 (NTSC resolution) 'VHS' analogue recording system. This is what most people would regard as the 'minimum acceptable' quality, however this is still only 704x480 = 1/3rd of a mega pixel (about the same as a typical 'VGA' web-cam)
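The pixel counts quoted so far are easy to check. A minimal sketch, using only the frame sizes given in the text (the dictionary and function names are my own, purely for illustration):

```python
# Frame sizes as quoted in the text (width, height).
FORMATS = {"CIF": (352, 240), "D1": (704, 480), "VGA": (640, 480)}

def pixels(name):
    """Total pixel count of one frame of the named format."""
    width, height = FORMATS[name]
    return width * height

for name in FORMATS:
    print(f"{name}: {pixels(name)} pixels = {pixels(name) / 1e6:.2f} mega pixel")

# Four CIF frames tile exactly into one D1 frame.
print(pixels("D1") // pixels("CIF"))
```

So 'full D1' is 337,920 pixels, i.e. about 1/3rd of a mega pixel, and exactly 4 times CIF.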

Apparently D1 PAL (720 x 576) cameras do exist, however the price is likely to be prohibitive.

Note that the resolution of 'full D1' cameras is said to be '700TV lines', which is a 'marketing description' (aka, a lie). The actual resolution is 480 lines (x 704 pixels)

Vendors of hugely expensive CCTV recording equipment have always tried to distance themselves from the cheap domestic VHS tape recorder on which their equipment is based, however all analogue tape systems are based on $100 NTSC, or (sometimes) PAL video-tape recorders with $1000 markups

To accommodate cheap web-cams, there is also a '4CIF' = 640x480 (VGA) standard, which uses the same VHS recording system as D1 (704x480). The 'clever' CCTV recording systems add 'black bars' to the sides, others just 'aspect ratio distort' the 640 (by recording analogue 640 as 704).

What about HD CCTV ?

The CCTV standard for HD is known as 'HD-SDI' (analogue, over co-ax) or HD-IP (digital, over ethernet) and the cameras (1080p) are designed to be used with a 'normal' DVR video recorder with a 'digitising' front-end.

For HD-IP a 'NVR' (Networked Video Recorder) is used - essentially a PVR/DVR with a slightly tweaked 'front end' that accepts only digital data streams (usually from an Ethernet 'source')

These devices are virtually identical to the domestic DVRs that record HD straight off the 'air waves' (although in some cases the $x,000 price hike may be justified by an ability to record multiple channels at the same time).

Many CCTV 'HD cameras' are stated to have a 'resolution' of '1000TV lines'. Since you will recall that this is a 'marketing description' (lie), you won't be fooled into thinking it's 1080 lines x 1920 pixels. In fact, a '1000 line' CCTV camera is actually generating 1000 pixels per horizontal line (and says nothing about the vertical line count). Since the real HD pixel count is 1920, a '1000 TV line' camera is actually generating only about half the normal HD resolution (which means, of course, that they only have to support half the bandwidth)

Further, the HD-IP industry claim to support 16 HD cameras via a standard Ethernet cable !
 
Dividing 100mbs by 16 gives us a 'per camera' max data rate of 6.25 mbs i.e. less than 1 Mbyte/s, which is actually not that different from the joke 'web HD' broadcasts, although recently (2016) some 'on-line' content providers have started to specify 2.5mbs as the 'minimum' requirement for their service
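As a quick check of that division, here is a small sketch. It assumes a plain 100 Mbit/s link shared equally between the cameras and ignores all protocol overhead (so the real figure would be lower); the function name is hypothetical:

```python
def per_camera_rate(link_mbit_s, cameras):
    """Return (Mbit/s, Mbyte/s) available to each camera on a shared link."""
    mbit = link_mbit_s / cameras
    return mbit, mbit / 8          # 8 bits per byte

mbit, mbyte = per_camera_rate(100, 16)
print(f"{mbit} Mbit/s = {mbyte:.2f} Mbyte/s per camera")
```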
 
However 1mbs is a long way short of the 9mbs supported by AVCHD, let alone the 30mbs (4 Mbyte/s) supported by the BluRay standard - but it does explain how they can record 8 or 16 cameras 'at once' on what must be essentially a standard domestic DVR

I don't go into private CCTV recording methods here, however even the most basic USB web-cam will be VGA resolution (640x480), which is already almost 4 times higher quality than CIF CCTV !

To get you started, I suggest looking into the very excellent Open Source iSpy PC based CCTV system. For general video playback, I recommend the Open Source VLC

Following the Pico OSD

The PIC12F683 based Pico OSD shows what can be achieved using a single PIC chip ! (see also video game).

This is what prompted me to 'have a go' using the basic 16F54/7/9, both because that's what I had available and because some 'extra' pins would be needed to 'receive' a NTP/GPS derived date/time sent via a serial data link

At this point I discovered that differences between the PIC16F54 and the PIC12F683 (used by the Pico OSD) rather scupper my plan.
First, the 16F54 has a single 'TRIS' command, which copies the contents of the Acc to the Tri-state latch of the target PORT (it is not possible to directly 'bit set' or 'bit clear' the Tri-state latch nor to 'shift' the contents of the latch itself).
So the fastest possible 'tri-state' output will thus require at least 2 instructions = 2 CLK's, i.e. one to 'shift' the bit pattern in the source register, and a second to copy the pattern to the TRIS latch - which means a whole PORT has to be dedicated to VTI (since the 16F5x TRIS command will change the i/o mode of every bit in the target PORT).
Worse, however, is that the 16F54 has no 'Shift Acc' (or 'Rotate Acc') instruction (you can't even add the Acc to itself). So you can't shift a bit pattern in the Acc (you can shift it in a reg but you can't send the reg contents direct to the TRIS latch).
So actually using TRIS will 'cost' at least 3 CLK's to output one bit (Rotate reg, Copy reg to Acc, Copy Acc to TRIS). This means 3 CLK's = 12 OSC cycles per bit which, even when overclocking to 25MHz, means each pixel is about 0.48uS 'wide', allowing only 108 pixels per line (or about 18 characters on a PAL 64uS standard line, of which only some 52uS is visible, see below) ...
If we want even half decent resolution, it's plainly not a viable approach to use the tri-state (TRIS) output mode with the 16F54. Instead I will have to leave the PORT enabled and 'buffer' the output with a diode or transistor
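The arithmetic behind those figures can be sketched as follows. The 4 OSC cycles per CPU CLK, the 3 CLK's per pixel and the 52uS visible line are taken from the text; the 6 pixels per character (5 pixel glyph plus 1 pixel gap) is my assumption, chosen because it reproduces the '18 characters' figure:

```python
OSC_PER_CLK = 4    # one CPU CLK is 4 OSC cycles on these PICs

def pixel_width_us(osc_mhz, clks_per_pixel):
    """Width of one output pixel in microseconds."""
    return clks_per_pixel * OSC_PER_CLK / osc_mhz

def pixels_per_line(osc_mhz, clks_per_pixel, visible_us=52.0):
    """Whole pixels that fit into the visible part of one scan line."""
    return int(visible_us / pixel_width_us(osc_mhz, clks_per_pixel))

width = pixel_width_us(25.0, 3)     # 16F54 overclocked to 25MHz, via TRIS
count = pixels_per_line(25.0, 3)
chars = count // 6                  # assumed 6 pixels per character cell
print(f"{width:.2f} us per pixel, {count} pixels, {chars} characters")
```

Changing clks_per_pixel to 1 shows what a 'one instruction per pixel' output method would buy on the same device.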

I documented my attempts to use the 16F5x anyway, since some of the techniques I devised are 'reused' in my more practical designs. For details, see my next page, PIC16F5x Video Time Inserter

PAL Sync detect

For details of the PAL TV frame timing, exit to this page. PAL Scan-line timing is 64µs (Active Display 51.95µs, Sync 12.05µs (Front Porch 1.65µs, H-Sync pulse 4.7µs, Back Porch 5.7µs))

To find the 'insert line', the PIC must be able to detect both the Frame Sync and Line Sync, so it can 'count' to the 'right' line. The problem is keeping the PIC generated pixels 'in step' with the video stream, and this means 'aligning' the pixels to the line sync with an accuracy of better than 1 pixel time

Since the 16F5x has no 'comparator' circuits (as used by the Pico OSD), at first sight it looked like I would have to use the LM1881N (or similar) video sync separator (8pin DIL) which costs between £1 (eBay, 8-10 off) and £2 (1 off from eBay or CPC).
 
Whilst this might 'do the job', my 'ultra-simple & low cost' goal would no longer be achievable.

The AV signal is (nominally) 1v peak-to-peak. 1v = 'white', 0.3v = 'black', 0v = 'sync', so 'in theory' anything under 0.3v = sync. In theory, all that's needed is some sort of 'high impedance' circuit (to avoid disturbing the video stream) that can 'monitor' the signal and 'spot' when it drops below (or rises above) 0.3v

The analogue AV can be 'tapped' with a LM311 (single), or 1/2 of LM393 (dual) or 1/4 of LM339 (quad), or similar, op-amp based voltage comparator (10p, 10off eBay) or your own design transistor based buffer / level converter and fed to one of the PIC 'digital' input pins.
 
The comparator circuit can be adjusted manually (using a variable resistor or 'pot') until (only) the 'low' going sync is 'seen' by the PIC pin.

PIC software can then 'time' the length of the Sync 'Lo' and thus distinguish between Frame Sync and Line Sync. To allow manual level setting, code may have to be written to drive an LED when the (Frame) Sync is 'seen' to provide visual feed-back (and avoid the need to use an oscilloscope to setup the 'Sync detector').
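The 'time the Lo' idea can be modelled on a PC before committing it to PIC code. This is only a sketch under assumptions: the nominal pulse widths (4.7uS line sync, 27.3uS 'broad' pulses during frame sync) are standard PAL figures, the 10uS threshold is an arbitrary illustrative choice, and the short equalising pulses are ignored. All names are hypothetical:

```python
# Assumed threshold: anything held Lo for longer than this is treated
# as part of the frame sync, anything shorter as a line sync.
FRAME_THRESHOLD_US = 10.0

def classify_sync(low_time_us):
    """Label one sync 'Lo' period by its measured length."""
    return "frame" if low_time_us > FRAME_THRESHOLD_US else "line"

def lines_since_frame(low_times_us):
    """Count the line syncs seen since the most recent frame sync pulse."""
    count = None                       # None until a frame sync is found
    for low_time in low_times_us:
        if classify_sync(low_time) == "frame":
            count = 0                  # restart the line count
        elif count is not None:
            count += 1
    return count

measured = [4.7, 4.7, 27.3, 27.3, 4.7, 4.7, 4.7]
print(lines_since_frame(measured))
```

The returned count is what the PIC would compare against the 'insert line' number; the LED feed-back mentioned above would simply be lit whenever classify_sync returns "frame".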

Detecting the pin state is also a pain
 
The 16F54 cannot raise an 'Interrupt' from an input pin (its only timing hardware is the Counter Timer and the WDT - Watch Dog Timer), so instead of using Interrupts to 'spot' the Sync signals, the pin will have to be 'sampled' at 'regular intervals'.
 
When the PIC is powered on, we have to keep sampling until we find the frame Sync; however, after that we only need to sample when we 'know' a Sync is 'due'

Unfortunately, even the fastest possible 'sampling' software will be unable to 'lock' the PIC pixel outputs to the video with any degree of accuracy - at best it is possible to achieve a 2-3 CPU CLK cycle 'window', which is 2-3 pixel widths and thus leads to unacceptable 'tearing' of the text !

To achieve a stable text display, we need a way to 'lock' the text pixel timing to the horizontal line start at 'better than CPU Clk' accuracy

PAL TV video pixel rates

One PAL video scan line is 64uS (line rate is thus 15.625 kHz), of which no more than 52uS contains the visible picture data.

The PAL standard TV line packs 720 pixels over the 52uS, so 0.072uS per pixel and a 'pixel clock' of 13.846 MHz (Wikipedia says the 'standard' is 13.5 MHz, however that's for 704 pixels per line (TV standard 'oblong' pixels), not for 720 (computer standard 'square' pixels)).
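For anyone wanting to try other pixel counts, the calculation is simply pixels divided by visible line time. A minimal sketch using the 52uS figure from the text (function names are mine):

```python
def pixel_clock_mhz(pixels, visible_us=52.0):
    """Pixel clock needed to fit 'pixels' into the visible line time."""
    return pixels / visible_us

def pixel_time_us(pixels, visible_us=52.0):
    """Duration of one pixel in microseconds."""
    return visible_us / pixels

print(round(pixel_clock_mhz(720), 3), "MHz for 720 pixels")
print(round(pixel_time_us(720), 3), "us per pixel")
print(round(13.5 * 52.0), "pixels fit in 52us at 13.5 MHz")
```

The last line shows that a 13.5 MHz clock gives about 702 pixels in 52uS, which is why that figure goes with the '704 pixel' line rather than the 720 one.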

The cheapest device that can achieve anything near the full PAL pixel clk rate is the PIC18F14K50 - its 'max specified' OSC is 48MHz (giving a CPU CLK of 12MHz), so a single cycle 'Rotate to output' instruction could achieve a very respectable 12 MHz pixel rate (the PIC18F14K50 can be overclocked to well over 50MHz, so in practice the full 13.846 MHz pixel rate (55.384 MHz OSC) should be achievable)

The PIC18F14K50 (at approx £2) is rather more expensive than the 16F54 (for which I paid 40p each) but will give at least double the resolution of the 16F54. The PIC18F14K50 also has a lot more functionality (which makes it much easier to use in this application) as well as supporting USB (so could even be used as a 'generic' OSD controlled from a PC via USB)



Finally, the PIC18 incorporates 'shift' hardware (intended for the SPI bus) that can be used to output at CPU CLK rate. You can find my 18F14K50 based VTI here

Locking PIC pixels to video data stream

So long as the video is 'stable' (i.e. doesn't itself drift) and the PIC OSC is a 'multiple' of the TV line frequency, 'tearing' can be avoided by 'sync'ing on the first Hsync for the first 'insert' point and then 'counting' CLK's for each of the next 4 line insert positions. This approach can be extended to sync'ing on VSync (and again counting until the insert point is reached)
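Whether a given OSC is a 'multiple' of the line frequency can be checked exactly. A sketch, assuming the CPU CLK is OSC/4 and the PAL line rate is 15.625 kHz; the 14.31818 MHz value is included only as an example of a crystal that is not a multiple:

```python
from fractions import Fraction

LINE_HZ = 15625                      # PAL line rate (64 us per line)

def clks_per_line(osc_hz):
    """CPU CLKs (OSC/4) in one scan line, as an exact fraction."""
    return Fraction(osc_hz, 4 * LINE_HZ)

def counts_cleanly(osc_hz):
    """True when a whole number of CPU CLKs fits in every line."""
    return clks_per_line(osc_hz).denominator == 1

for osc in (20_000_000, 25_000_000, 14_318_180):
    print(osc, clks_per_line(osc), counts_cleanly(osc))
```

At 20MHz there are exactly 320 CPU CLKs per line, so 'counting' from one Hsync to the next insert position stays in step; with a non-multiple OSC the fractional remainder accumulates line by line.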

Jitter - text line jumping sideways by a pixel - is another problem. This is caused when the Sync happens to occur close to the start (or end) of a PIC CPU CLK and is 'spotted' 1 clock early (or late).

If the video is not 'rock solid' (i.e. the video line timing drifts) or the PIC OSC is not a simple multiple of the video line frequency, then some means of locking the PIC to HSync (line sync) is needed.
 
It's not possible to achieve the required accuracy (at least OSC/2) using software - at best that can 'detect' the state of a PORT bit only every 2-3 CPU CLK's

There are a number of ways this can be achieved. One would be to use external circuits to buffer the PIC pixel output (eg use an external shift register tied to HSync). Another would be to adjust the PIC OSC (i.e. the CPU OSC/4 phase). A third way is to generate the PIC clock itself from the video data HSync stream (eg using a PLL)

Whatever the solution, some extra circuitry will be needed

Using external shift registers

The advantage of using an external pixel shift register is that it also 'solves' the speed problem (assuming the PIC can load a 'set' of 5 or 8 pixels to the shift register each time).

To 'sync' the pixel output, the shift register (s/r) would operate as follows:-



Hsync holds the shift register in 'reset' (and disables the s/r load clock), the end of Hsync allows the s/r to start shifting (and reloading on every 8th pixel clock). So long as the PIC presents the data at the s/r pins 'within' the 8 s/r clock time window, the characters will be stable.



During the video line time prior to the actual character display position, the PIC must ensure the s/r pins are kept 'Lo' (i.e. same as the 'reset' level), otherwise unwanted pixels will 'escape' into the video stream



The output of the s/r has to be buffered (using another transistor) so only '1' data bits (the character shape pixels) are passed into the video stream

In theory the shift-register could be clocked at PIC OSC, however the CPU (which is OSC/4) could never keep up, even if it loaded 8 pixels at a time (8 pixels at OSC would be output in 2 CPU clks)

So the shift-register must be clocked at OSC/2. If the CPU loads a byte (8 pixels) at a time then this allows a more reasonable 4 CPU clocks between byte loads. This doubles the max. pixel rate (and thus the resolution) to (at least) 10MHz
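The trade-off between shift clock and CPU 'breathing space' can be tabulated. A small sketch, assuming 8 pixels per load and a CPU CLK of OSC/4 (names are illustrative only):

```python
def cpu_clks_between_loads(osc_divider, bits=8):
    """CPU CLKs (OSC/4) available per byte when the s/r runs at OSC/osc_divider."""
    return bits * osc_divider / 4

def pixel_rate_mhz(osc_mhz, osc_divider):
    """Pixel rate when the s/r is clocked at OSC/osc_divider."""
    return osc_mhz / osc_divider

for divider in (1, 2, 4):
    print(f"OSC/{divider}: {cpu_clks_between_loads(divider):.0f} CPU clks per byte, "
          f"{pixel_rate_mhz(20.0, divider):.0f} MHz pixels at 20MHz OSC")
```

OSC/1 leaves only 2 CPU clks per byte, OSC/2 leaves 4 at a 10MHz pixel rate, and OSC/4 is back to the 5MHz rate of direct output.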

Since a 'divide by 2' circuit will be required, this can also be controlled by Hsync, which can be used to 'start' the pixel clock to an accuracy of 1/2 a pixel (which should be enough to ensure a stable output)

Finally, a means of 'sync loading on every 8th pixel' is required (the obvious method being a second s/r, wired up to present its 8th bit to the 'sync load' inputs of both itself and the 'pixel' s/r).
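The 'second s/r as load counter' idea can be simulated in software to check the bit ordering. This is a behavioural sketch only, not a model of any particular chip: it assumes MSB-first output, that the first byte is loaded as Hsync releases, and that the PIC always has the next byte waiting at the inputs:

```python
def shift_out(data, clocks):
    """Simulate a pixel s/r reloaded every 8th clock by a one-hot 'ring' s/r."""
    feed = iter(data)
    pixel_sr = next(feed, 0)          # first byte loaded as Hsync releases
    ring = 0x01                       # single '1' circulating in the second s/r
    out = []
    for _ in range(clocks):
        out.append(pixel_sr >> 7)     # serial output is the top bit
        if ring & 0x80:               # 8th stage set: sync load both registers
            pixel_sr = next(feed, 0)  # 0 (blank) if the PIC has nothing ready
            ring = 0x01
        else:
            pixel_sr = (pixel_sr << 1) & 0xFF
            ring <<= 1
    return out

print(shift_out([0xB0, 0x0F], 16))
```

Running it for more clocks than there is data shows the output falling back to '0', which corresponds to the PIC holding the s/r pins 'Lo' outside the character area.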

Adjusting the PIC OSC

Sync'ing the PIC OSC to the HSync also needs additional hardware. On page 66 of the spec. we note that, when an external clock is used, the minimum frequency is '0' (i.e. 'stopped'). So, if a transition on one of the PIC i/o pins is used to 'stop' the PIC's own external clock, and the video Hsync is used to restart it, we have a way to 're-sync' the PIC OSC, and thus the output pixels, to within the required 1 OSC period !

In the circuit, right, three 74S00 NAND gates are used to build an OSC, with the 4th gate controlled by the output of a 7474 (edge triggered) flip-flop. To 'stop clock', the PIC transitions the I/O pin from Lo to Hi resulting in the CLK being held Hi (it is possible that the CLK should be held Lo instead (to avoid glitches) - it depends on PIC PORT pin timing - in which case a 74S02 NOR package should be used instead of the 74S00 NAND).

One drawback is the difficulty of generating an 'exact' 50:50 CLK (especially vital if we intend to over-clock the PIC) using NAND (or NOR) gates - a lot of 'trial and error' adjusting Capacitor values will typically be involved. One way to get an exact 50:50 mark/space ratio is to use a 40-50MHz OSC source and the second 7474 as a 'divide by 2' - however an OSC with a clock frequency in the 50MHz region will not only need fast logic gates but is very difficult to build with a 'simple' inverter style feed-back circuit

A 48MHz 'can' OSC can be found on eBay for about £1. Almost any frequency can be had and some even have a 'tri-state' control pin (which tri-states the output).

Generating PIC OSC from video HSync

HSync is 15.625 kHz. A PLL circuit contains a digital divider in its feedback loop - these are (typically) powers of 2. To get 20MHz from 15.625 kHz, the exact divide is 1280, so the nearest power of 2 is 1024, giving us 16MHz.

16MHz OSC means a 4MHz CPU CLK, which is rather a low pixel rate (208 pixels per visible line)
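The divider sums can be reproduced in a few lines. A sketch, assuming a power-of-2 feedback divider, one pixel per CPU CLK (OSC/4) and 52uS of visible line; the helper name is my own:

```python
LINE_HZ = 15625          # PAL HSync rate
VISIBLE_US = 52          # visible part of one scan line

def nearest_power_of_two(n):
    """Power of 2 closest to n (ties go to the lower one)."""
    lower = 1 << (int(n).bit_length() - 1)
    upper = lower * 2
    return lower if (n - lower) <= (upper - n) else upper

target_hz = 20_000_000
exact = target_hz // LINE_HZ                  # divide needed for exactly 20MHz
divider = nearest_power_of_two(exact)         # what a power-of-2 divider gives
osc_hz = divider * LINE_HZ
cpu_hz = osc_hz // 4
pixels = cpu_hz * VISIBLE_US // 1_000_000     # one pixel per CPU CLK
print(exact, divider, osc_hz, cpu_hz, pixels)
```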

Time keeping

Unless a RTC or GPS is used, the initial time will have to be set by some external means (manually, using buttons, or by serial link transmission from a PC etc) just like any other clock.

The fractional seconds (.ss) can be derived by 'counting' on alternate Frame Sync pulses (to avoid the value changing between 'half frames'), but this does mean the accuracy is 1/25th s (.04)

Using a GPS chip

The GPS unit will be used to set the initial time (i.e. after power-on) and at some regular interval there-after.

Unless absolute accuracy is required, the time count will be maintained by counting full video frame VSyncs (this avoids the value changing between half frames but means the accuracy is 1/25th s (.04)).



If higher accuracy is required, HSync pulses can be counted.
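Counting full frames into a displayable time is straightforward. A minimal sketch, assuming 25 full frames per second (PAL) and a start time supplied as seconds since midnight (from the GPS or a manual setting); the function name and output format are illustrative only:

```python
FRAMES_PER_SECOND = 25               # full PAL frames, so steps of 0.04 s

def time_after_frames(start_seconds, frames):
    """Format the time reached 'frames' full frames after 'start_seconds'."""
    total_frames = start_seconds * FRAMES_PER_SECOND + frames
    seconds, frame_in_second = divmod(total_frames, FRAMES_PER_SECOND)
    seconds %= 24 * 3600             # wrap at midnight
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    hundredths = frame_in_second * 4 # each frame is 4/100ths of a second
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{hundredths:02d}"

print(time_after_frames(12 * 3600, 37))
```

Counting HSync pulses instead would work the same way with 15625 counts per second in place of 25.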

Faster output using an external shift register

The output 'resolution' is limited by the PIC 'ROTL PORT' instruction (and the inter-character gap to the time taken to find the next byte value). If an external pixel shift register is used (eg 74HC166) the PIC would only need to 'feed' it complete bytes .. and this can be done at a rate of 2 CPU CLK's per byte (using absolute Register addressing, COPY reg,Acc, COPY Acc,PORT).

In theory this allows the 16F5x, at max. OSC 20MHz, to output 8 pixels every 2 CPU CLK's (i.e. every 2 x 4 = 8 OSC cycles), which is 20 mega pixels per second. This is actually faster than the PAL TV data rate (about 13MHz).
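That 'in theory' figure, and how it compares with the PAL pixel clock, can be checked as follows. A sketch assuming a CPU CLK of OSC/4 and a full byte handed to the s/r every 2 CPU CLK's; the 13.846 MHz comparison value is the 720 pixel figure worked out earlier:

```python
PAL_PIXEL_MHZ = 720 / 52.0           # about 13.846 MHz

def mega_pixels_per_second(osc_mhz, clks_per_byte=2, bits=8):
    """Pixel rate when the PIC feeds one byte to the s/r every clks_per_byte CPU CLKs."""
    cpu_mhz = osc_mhz / 4
    return cpu_mhz / clks_per_byte * bits

rate = mega_pixels_per_second(20.0)
print(f"{rate:.0f} mega pixels/s, PAL needs {PAL_PIXEL_MHZ:.3f}")
print("fast enough" if rate >= PAL_PIXEL_MHZ else "too slow")
```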

The drawback is that some sort of external circuit not only has to clock the shift-register, but also has to provide it with a 'load' pulse every 8th clock - and all this has to be kept 'in sync' with the PIC CPU !

Once the cost of all these extra 'counter' chips etc. is added, it becomes cheaper to use a 'high end' 48MHz+ PIC (with its internal shift register) instead
