DExx-vd_isl video digitizer add-on for Intel FPGA dev boards

marqs · Post by **marqs** » Thu Feb 04, 2021 11:11 pm

I've now ported the code to Cyclone V GX Starter Kit ("C5G"). The board / FPGA is somewhat limited in IO as there's only 2x20 GPIO and HSMC, and some of the existing IO is shared between different HW/connectors. The on-board ADV7513 also has none of its audio inputs connected to the FPGA, although they are wired to test points which could easily be connected to DExx-vd_isl audio outputs. Most of my testing concentrated on the SiI1136 daughtercard, which seems to work very well - I was able to get 2560x1440@60Hz output operational (via Si5351C generated clock) with and without pixel repetition, and SiI1136 seemed to work in a stable manner with custom drivers. A major surprise was that ADV7513 also output these modes fine - I'm not sure if I was just lucky to get a fast chip, but will try that next on DE10-Nano too.

One disappointment cast a shadow on the otherwise good news, though: performance of the C7 grade Cyclone V (28nm) is not that much better than the previous generation Cyclone IV (65nm) according to the timing analyzer. In the slow condition it only guarantees ~200MHz operation, which is quite far from the 242MHz target. So it seems 2560x1440@60Hz support will be more or less a silicon lottery, even if SiI1136 is used.
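For reference, a pixel clock follows directly from the total frame size (active area plus blanking) and the refresh rate. The sketch below assumes reduced-blanking totals of 2720x1481 for 2560x1440; those totals are an assumption for illustration and are not quoted anywhere in this thread.

```python
def pixel_clock_mhz(h_total, v_total, refresh_hz):
    """Pixel clock in MHz for a mode with the given total line length and line count."""
    return h_total * v_total * refresh_hz / 1e6


# Assumed reduced-blanking totals for 2560x1440 (hypothetical, not from the thread).
clk = pixel_clock_mhz(2720, 1481, 60)
print(f"{clk:.1f} MHz")  # lands close to the ~242MHz target mentioned above
```

With less aggressive blanking the totals grow, and so does the required clock, which is why the reduced-blanking variant is the one of interest here.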

marqs · Post by **marqs** » Sat Feb 06, 2021 8:56 pm

2560x1440@60Hz seems to work on my DE10-Nano too even if it requires going way beyond specs of 3 different chips... Updated repo with C5G and DE10-Nano images with added support for 2560x1440 in generic modes (will generate PLL configs for optimized modes later).

6t8k · Post by **6t8k** » Sun Feb 07, 2021 6:56 pm

In some quick tests, 2560x1440@60Hz seemed to work just fine with my DE10-Nano too. An example: YouTube. Will test 480p input / 16:9 later.

My monitors' specs don't allow me to view the 1920x1080@120Hz that's now available via the test pattern. The capture card should just about handle it (297MHz HDMI RX and supports capture of up to 120fps), but jumps around between "no signal" and "unsupported signal". Tried it with three different HDMI cables, same behavior with each, so possibly signal integrity has taken too much of a hit at an earlier point.

Surprising findings indeed - now what to do?

Do you think it may be worthwhile to try an OSSC Pro prototype with a C6 grade Cyclone V, introducing more leeway there, hopefully excluding faults? Then again, especially if a HW design that incorporates the SiI1136 instead of the ADV7513 isn't yet ready, it'd be a significant risk in various ways again.

In Jan 2020 you reported that you got visible artifacts with the ADV7513 at 2560x1440 w/ reduced blanking using pixel repetition. Since the OSSC Pro spec so far included a C8 grade Cyclone V, can the latter be ruled out as being the bottleneck instead of the ADV7513?

marqs · Post by **marqs** » Mon Feb 08, 2021 10:50 pm

6t8k wrote:My monitors' specs don't allow me to view the 1920x1080@120Hz that's now available via the test pattern. The capture card should just about handle it (297MHz HDMI RX and supports capture of up to 120fps), but jumps around between "no signal" and "unsupported signal". Tried it with three different HDMI cables, same behavior with each, so possibly signal integrity has taken too much of a hit at an earlier point.

I didn't get 1920x1080@120Hz output on ADV7513 either, while SiI1136 had no issue with 285MHz as expected (clock generator and FPGA ran at half frequency due to the pixel repetition trick).

6t8k wrote:Surprising findings indeed - now what to do? Do you think it may be worthwhile to try an OSSC Pro prototype with a C6 grade Cyclone V, introducing more leeway there, hopefully excluding faults? Then again, especially if a HW design that incorporates the SiI1136 instead of the ADV7513 isn't yet ready, it'd be a significant risk in various ways again.

In Jan 2020 you reported that you got visible artifacts with the ADV7513 at 2560x1440 w/ reduced blanking using pixel repetition. Since the OSSC Pro spec so far included a C8 grade Cyclone V, can the latter be ruled out as being the bottleneck instead of the ADV7513?

There were indeed artifacts back then while quickly testing the mode with older firmware and prototype. However, I just integrated the latest changes and it seems to run fine on the new board. Upgrading to C7 might make sense as it'd bring FPGA performance on par with these Cyclone V development boards. Going to C6 probably pushes the price too much while still not guaranteeing each board will run without issues. The official spec for OSSC Pro is thus likely to remain at 1920x1440@60Hz (185MHz), while 2560x1440@60Hz (242MHz) should be treated as an extra that would still work in most cases. The official spec shouldn't require going beyond the limits of other chips either, so replacing ADV7513 (165MHz spec) with SiI1136 (300MHz spec) is preferable and would also enable 1920x1080@120Hz as a bonus. Based on recent tests I'm becoming more confident with SiI1136, so the risk of using it in the next prototype is not so massive.
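The spec margins discussed above can be laid out as a small lookup. This is only a sketch using the clock figures quoted in this thread; the function names are made up for illustration, and halving the clock under pixel repetition follows the "half frequency" remark earlier in the thread.

```python
# Figures quoted in this thread (MHz).
TX_SPEC_MHZ = {"ADV7513": 165, "SiI1136": 300}
MODE_CLOCK_MHZ = {
    "1920x1440@60Hz": 185,
    "2560x1440@60Hz": 242,
    "1920x1080@120Hz": 285,
}


def within_tx_spec(tx, mode):
    """True if the mode's pixel clock stays inside the transmitter's rated limit."""
    return MODE_CLOCK_MHZ[mode] <= TX_SPEC_MHZ[tx]


def fpga_clock_mhz(mode, pixel_repetition=False):
    """Clock the FPGA has to run at; pixel repetition halves it."""
    clk = MODE_CLOCK_MHZ[mode]
    return clk / 2 if pixel_repetition else clk


for mode in MODE_CLOCK_MHZ:
    print(mode, {tx: within_tx_spec(tx, mode) for tx in TX_SPEC_MHZ})
```

All three modes exceed the ADV7513 rating and fit inside the SiI1136 rating, which is the argument for swapping the transmitter.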

6t8k · Post by **6t8k** » Tue Feb 09, 2021 8:49 pm

Thank you for your assessment and prospects.

So far I didn't encounter any problems with 2560x1440@60Hz on my DE10-Nano. 480p input and 16:9 sampling also work well, example: YouTube*
* YT's compression really made that look a lot worse, it's a mere shadow of DE10-vd_isl's output. Intentionally no audio for the reasons I explained earlier.

Displayed value for H frequency with interlaced input and pure LM Deint+L4x mode are also fixed with the Feb 06 revision.

marqs · Post by **marqs** » Tue Mar 02, 2021 4:59 pm

I've recently run some tests with Intel VIP cores such as Deinterlacer II on OSSC Pro. If there are people interested in helping evaluate and/or integrate functionality which is closer to more traditional scalers, I could port the test branch to some board(s) supported by DExx-vd_isl.

6t8k · Post by **6t8k** » Thu Mar 04, 2021 3:00 pm

I'm a little busy right now, but I'm keen on trying this out - if you port the test branch to DE10-Nano I'll see how I can support.

marqs · Post by **marqs** » Tue Apr 06, 2021 7:55 pm

6t8k wrote:I'm a little busy right now, but I'm keen on trying this out - if you port the test branch to DE10-Nano I'll see how I can support.

The branch is now ported to DE10-Nano. Setting up the system is a bit cumbersome as it needs HPS to configure DRAM bridge and bitstream must be programmed separately due to VIP evaluation. The following steps are required in order to get it running:

1. HPS bootloader image needs to be first written (raw mode) to a micro SD card.
2. DE10-Nano is then powered up with the SD card inserted
3. Time-limited bitstream is next programmed to FPGA. USB cable should be left connected and the popup window open
4. HPS must be then reset with the centermost button of the triplet on DE10-Nano PCB
5. Now DRAM interface should be functional, and scaler mode can be selected on Output opt. Scaler settings under scaler menu are then effective.

It's a good idea to make sure a capable PSU is used since I spent hours debugging a stability issue that was caused by a 2A PSU which I've been using with OSSC Pro without issues. The build is still work in progress but has quite a few features. Some notes on scaler functionality:

* 1920x1080p output mode has the largest amount of framelock presets implemented currently
* Framelock status is indicated by 4 user LEDs closest to FPGA. If suitable PLL preset is not found for input/output combination, operation falls back to free-running (i.e. no framelock)
* If framelock is disabled manually and aspect is set to any other than 1:1 source PAR, input mode switching does not interrupt output
* Scaling algorithm should be set to Nearest for 2D content
* Scaler mode does not yet have as low latency as it should in framelocked mode. It seems that the framebuffer module would need to be custom-designed for that, or perhaps a separate VIP writer and reader plus some logic to sync them optimally could be used.
* Many console-specific low-res optimal modes do not work due to a bug in VIP. SNES 512col is one of the few that seems to be OK

6t8k · Post by **6t8k** » Wed Apr 07, 2021 8:12 pm

Thanks a lot, that's indeed a considerable assortment of features already! In accordance with the steps mentioned, the vip_test branch is up and running on my DE10-Nano, I'll have a closer look on it soon. Thanks as well for the advice with regards to the PSU – a 4A one is already ordered just in case.

6t8k · Post by **6t8k** » Sun Apr 11, 2021 5:08 pm

I've now had some time to look into it, I tried to touch on a broader range of topics to get an overview for now. Below are the main points from my notes, feel free to take up any aspect.

About the output timings, I've found that 1600x1200 (60Hz) and 1280x1024 (60Hz) neither work with my monitor nor with my capture card (both show 'no signal'), regardless of framelock status. Both accept the 1600x1200 signal that cps2_digiav outputs, and I believe 1280x1024 shouldn't be a problem for both, so I'm drawn towards the thought that there might be something wrong with them, but I didn't further look into it yet. 1920x1200 works with the capture card, with the monitor it so far only cooperated with disabled framelock (while the monitor accepts 1920x1200 from the OSSC Classic at various refresh rates). The remaining timings implemented so far, namely 720x480 (60Hz), 720x576 (50Hz), 1280x720, 1920x1080, 1920x1440 and 2560x1440 all work (the latter two I could only test with the capture card as they surpass the capabilities of the monitor).

On the motion-adaptive deinterlacer, considering it only uses a retrospective framebuffer, introducing only around 2 lines of latency, the result is indeed quite good as far as I can judge. I've compared it to the Framemeister's based on a couple examples: for 3D games, a well configured FM looks smoother and sharper overall, the vip_test branch currently has a slightly more grainy appearance in comparison, but this is mostly caused by differences in the polyphase scaler, not in the deinterlacer. In general, the tendency for deinterlacing artifacts appears to be quite similar based on what I've seen so far, but I haven't yet tried any frame-rendered (as opposed to field-rendered) games that have interlaced output. Perhaps there are some specific games/scenes or stress tests too where a comparison would be worthwhile – if you have any suggestions, feel free to mention them. The only thing that stands out a bit to me currently is that with vip_test, bobbing artifacts can sometimes get quite obvious when navigating menus (the FM masks bobbing artifacts a bit better, see below), so it would be nice if the computed motion could be made to decay a touch faster, but it looks like this aspect is baked into the IP core without the possibility to modify it.

Here are a few videos, the lossy (but still high quality) encodings are also on YT (loses a bit more detail): vip_test, Framemeister
If there's anything notable that I've missed (which I'm almost certain of), please do chime in! With DE10-vd_isl, there's sometimes a bit of shakiness to parts of the picture. This is because of some clock problem my vd_isl board seems to have developed, which I haven't really debugged yet. I'm sorry about that, but you should be able to ignore it for the purposes here.

I'd be able to test the 'High Quality' motion-adaptive deinterlacer that utilizes sobel edge interpolation. Considering there was talk in the OSSC Pro thread about an alternative firmware that could provide it by means of jettisoning other functionality, or an OSSC Pro variant with a beefier FPGA (due to the IP core eating up quite a lot of FPGA resources), a comparison might make sense (I'm certainly curious myself). However, it might be worthwhile to check beforehand to what extent the performance of the 'standard' motion-adaptive deinterlacer could be improved by tuning it via its motion shift and -scale registers (section 15.13.2 in the VIP user guide).

As for the scaler, vip_test's polyphase one appears smoother for 2D games as the FM's emphasis on edges gets very obvious there. vip_test's nearest-neighbor scaler expectedly delivers the most accurate (still) picture, but depending on the ratio between the source and target pixel dimensions it can cause visible shimmering. The FM's more 'aggressive' scaling at times almost reminds me of something in the mould of hqnx and the like, but it causes more ringing (which distorts the original picture information more but also increases perceived sharpness). It helps the FM mask the deinterlacing artifacts a bit better, which applies mostly to the bobbing that affects moving parts of the image. Here are a few lossless screenshots comparing vip_test to the FM: 1, 2, 3, 4, 5, 6, 7

Although it's a matter of taste in the end, the result(s) generated by a Scaler II core can be tweaked by feeding it with custom filter coefficients. Generally speaking, for best results, the coefficients used should be conditioned on the scaling factor, and could be calculated as needed using the soft-CPU. And maybe someday there could be a WYSIWYG editor that lets users customize the look of the scaler and generate an appropriate coefficients file, which could then be put onto an SD card and loaded into the scaler core at runtime using the soft-CPU. On that note, I'm wondering if the NN-Scaler block is needed at all in the long term, since (in theory) polyphase scaling is a generalization of nearest-neighbor scaling, so with the right set of coefficients, a polyphase scaler should be able to assume that task as well.
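To illustrate the point that polyphase scaling generalizes nearest-neighbor scaling, here is a toy 1D, 2-tap polyphase resampler. It is a sketch under simplifying assumptions: the tap count, phase count, coefficient layout and function names are all made up for illustration and do not reflect the Scaler II core's actual coefficient format.

```python
def nearest_coeffs(phases):
    """One-hot taps: left sample for the first half of the phases, right sample otherwise."""
    return [(1.0, 0.0) if p < phases // 2 else (0.0, 1.0) for p in range(phases)]


def linear_coeffs(phases):
    """Taps that blend the two neighbours linearly by phase."""
    return [(1.0 - p / phases, p / phases) for p in range(phases)]


def polyphase_1d(src, out_len, coeffs):
    """Resample src to out_len samples using a 2-tap filter selected per output phase."""
    phases = len(coeffs)  # assumed to be even
    den = 2 * out_len
    last = len(src) - 1
    out = []
    for i in range(out_len):
        # Centre-aligned source position, kept as an exact fraction num/den.
        num = (2 * i + 1) * len(src) - out_len
        base = num // den            # left neighbour (may be -1 at the edge)
        rem = num - base * den       # fractional part, 0 <= rem < den
        left, right = coeffs[rem * phases // den]
        a = src[min(max(base, 0), last)]
        b = src[min(max(base + 1, 0), last)]
        out.append(left * a + right * b)
    return out


def nearest_1d(src, out_len):
    """Direct nearest-neighbor resampling for comparison."""
    return [src[(2 * i + 1) * len(src) // (2 * out_len)] for i in range(out_len)]


line = [10, 20, 30, 40]
print(polyphase_1d(line, 10, nearest_coeffs(16)) == nearest_1d(line, 10))
```

Swapping in `linear_coeffs` (or any other table) changes the look without touching the resampling loop, which is what would make runtime-loadable coefficient sets attractive.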

Further, here's a demonstration of the seamless mode switch feature:
vip_test: please ignore the "Switched to ... at"-times, I simply pressed A at an arbitrary point in time to advance the switching – what you see is what I saw.
Framemeister for comparison: "Switched to ... at"-times are about correct, the capture card was faster than the monitor.

Finally, these are the bugs I've encountered so far:
* When AR is 1:1 source PAR, and the input mode is switched, there's a chance that the output gets corrupted into noise, see video (what's also noteworthy is that seamless mode switch does work in combination with 1:1 source PAR when framelock is disabled manually)
* When the deinterlacer is in weave mode, and the input mode is switched to 288p, the progressive signal is 'deinterlaced' instead of passed through, see video
* When the signal source is switched off, the last frame remains on screen indefinitely (sometimes corrupted/truncated, depending on settings), maybe because the frame buffer stalled.

I'll perform some more tests with different input signals soon.

marqs wrote:* Scaler mode does not yet have as low latency as it should in framelocked mode. It seems that the framebuffer module would need to be custom-designed for that, or perhaps a separate VIP writer and reader plus some logic to sync them optimally could be used.

Could it be the case that the Frame Buffer II always operates in triple buffering mode currently, even when framelocked? In Platform Designer I seemingly can't specify whether it should be a double or a triple buffer at compile time; the introduction to chapter 16 and section 16.6 in the guide read like the core only implements double buffering when frame dropping and frame repeating are disallowed, but both are allowed currently.

marqs · Post by **marqs** » Sun Apr 11, 2021 8:22 pm

Thanks for testing it out. A few quick comments:

* Do 1600x1200 (60Hz), 1280x1024 (60Hz) and 1920x1200 work in Adaptive LM mode? It should use same framelock/output parameters, so if they work then there's probably just some issue on scaler preset selection which has been added recently
* There should be enough unused logic on the FPGA to enable HQ motion adaptive deinterlacer, but block RAM is getting tight. Adaptive LM currently uses block RAM instead of external RAM so reducing its line buffers is an easy way to free block RAM if needed.
* HQ motion adaptive deinterlacer might not meet timing at current 125MHz, but it can be reduced if you're just testing 480i/576i content and not using 2560x1440 output. In principle 75MHz should be enough for deinterlacing 1080i, but for some reason at least on Pro I've had to run processing chain at 150MHz (2 pixels parallel) to do that successfully even though it should need only half of that throughput.
* About the progressive signal getting deinterlaced, it might be due to the CVI IP expecting fields in a certain way. Now that I remember, a constant even field indicator caused some issues, so I should make sure progressive video is always sent to VIP with a constant odd field.
* I think the largest limitation in Frame Buffer II is that it starts to read buffer only after a frame has been fully written on it. The triple buffer configuration (which is inherently needed for non-framelock) would be just fine if it was possible to start reading of the buffer while it's still written. Exact timing would depend on scaling parameters and would be calculated like in adaptive LM mode.

6t8k · Post by **6t8k** » Mon Apr 12, 2021 11:19 pm

Thank you for your comments.

marqs wrote:* Do 1600x1200 (60Hz), 1280x1024 (60Hz) and 1920x1200 work in Adaptive LM mode? It should use same framelock/output parameters, so if they work then there's probably just some issue on scaler preset selection which has been added recently

With the same game at 240p using SCL mode, or using ALM mode with the game at 240p or 480i, everything works as expected.

marqs wrote:* There should be enough unused logic on the FPGA to enable HQ motion adaptive deinterlacer, but block RAM is getting tight. Adaptive LM currently uses block RAM instead of external RAM so reducing its line buffers is an easy way to free block RAM if needed.

Thanks for the advice. For now, when compiling the HQMA variant of the deinterlacer core, 60% of total logic space, 66% of total memory bits, 97% of total RAM blocks and 91% of total DSP blocks are utilized according to fitter summary (when compiling with the standard quality MA deinterlacer core, these numbers amount to 41/64/94/51%, respectively). Programming the thus newly generated time-limited .sof, the deinterlacing works, but it looks exactly like the standard quality MA deinterlacing, I think. All I did to that end was changing the deinterlacing algorithm of the core to the HQ variant in Platform Designer, regenerating the qsys output files, and starting a new complete compilation – I might have missed something.

marqs wrote:* I think the largest limitation in Frame Buffer II is that it starts to read buffer only after a frame has been fully written on it. The triple buffer configuration (which is inherently needed for non-framelock) would be just fine if it was possible to start reading of the buffer while it's still written. Exact timing would depend on scaling parameters and would be calculated like in adaptive LM mode.

Oh, I see now, yes...
But to hark back to my original point, couldn't it be a cleaner solution to use only double buffering for framelocked mode (since no frame rate conversion is needed), as it makes impossible the asynchronous/nondeterministic behavior that comes with the triple buffering? In fact I'm wondering what the reasons are for using the frame buffer during framelocked operation... couldn't it be bypassed in that case?

6t8k wrote:* When AR is 1:1 source PAR, and the input mode is switched, there's a chance that the output gets corrupted into noise, see video (what's also noteworthy is that seamless mode switch does work in combination with 1:1 source PAR when framelock is disabled manually)

Addendum: the bug only seems to happen when framelock is on, and doesn't seem to happen with slower timings like 720x480 (60Hz). I was also able to trigger it by switching from 2560x1440 to 720x480 (60Hz) while AR was set to 1:1 source PAR. A clocking/metastability issue? In that latter scenario with framelock off, seamless mode switch only works ~95% of the time, at least with my equipment. Another thing I encountered is that when AR is at 1:1 source PAR, 1920x1440 and 2560x1440 show an all-black picture even though the signal source is switched on etc, but it's probably just not (yet) implemented.

6t8k · Post by **6t8k** » Sat Apr 17, 2021 12:28 am

1080i deinterlacing test (standard-quality motion adaptive deinterlacer) using Gran Turismo 4 on the PS2: YouTube
I'm only getting a gray screen so far, which I cannot fix by changing any option in the Scaler opt. menu. Input preset is 1080i_60 and output mode is 1080p_50 while the framelock status LED group is lit. One time after the gray screen when going back to ALM mode and then to SCL mode again, the picture froze after a second or so (at 480i input). Resetting HPS got SCL mode back to working order.

marqs · Post by **marqs** » Sat Apr 17, 2021 6:59 pm

6t8k wrote:Thank you for your comments.

marqs wrote:* Do 1600x1200 (60Hz), 1280x1024 (60Hz) and 1920x1200 work in Adaptive LM mode? It should use same framelock/output parameters, so if they work then there's probably just some issue on scaler preset selection which has been added recently
I retested this, here is a slightly corrected and more detailed account:

With the same game at 240p using SCL mode, or using ALM mode with the game at 240p or 480i, everything works as expected.

I checked this and found a couple bugs. They are now fixed in the latest image although framelock presets for 480i->1200p have not been added yet. I'm planning to add those and a few additional framelock modes such as "On (2x Hz)", "Off (nearest Hz)", "Off (100Hz)" and "Off (120Hz)" in near future.

6t8k wrote:Thanks for the advice. For now, when compiling the HQMA variant of the deinterlacer core, 60% of total logic space, 66% of total memory bits, 97% of total RAM blocks and 91% of total DSP blocks are utilized according to fitter summary (when compiling with the standard quality MA deinterlacer core, these numbers amount to 41/64/94/51%, respectively). Programming the thus newly generated time-limited .sof, the deinterlacing works, but it looks exactly like the standard quality MA deinterlacing, I think. All I did to that end was changing the deinterlacing algorithm of the core to the HQ variant in Platform Designer, regenerating the qsys output files, and starting a new complete compilation – I might have missed something.

Perhaps the default motion shift and scale values should be tweaked for that to make a difference. I could add respective options on menu with a flag that enables visualization of motion vectors.

6t8k wrote:But to hark back to my original point, couldn't it be a cleaner solution to use only double buffering for framelocked mode (since no frame rate conversion is needed), as it makes impossible the asynchronous/nondeterministic behavior that comes with the triple buffering? In fact I'm wondering what the reasons are for using the frame buffer during framelocked operation... couldn't it be bypassed in that case?

Framebuffer could indeed be bypassed, but then CVO FIFO (composed of block RAM) would need to be very large to account for varying input/output mode combinations, especially once zooming is implemented. Nondeterministic behavior of the triple buffer could probably be eliminated with the locked mode setting on the existing implementation, but latency would still be at least 1 frame. Regardless of whether it is a double or triple buffer, we'd need a way to deterministically place the read pointer only as far from the write pointer as needed (which is less than 1 frame in the majority of cases, rotation perhaps being the largest exception).
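To put the "at least 1 frame" figure in perspective, the distance between the read and write pointers can be converted into time. This is a simplified sketch that measures the distance in input lines and ignores the scaling-dependent details; the 525-line, 59.94Hz input is a hypothetical example, not a measurement from this thread.

```python
def buffer_latency_ms(lines_behind, total_lines, refresh_hz):
    """Time the reader trails the writer, given their distance in input lines."""
    line_time_ms = 1000.0 / (total_lines * refresh_hz)
    return lines_behind * line_time_ms


# Hypothetical 525-line, 59.94Hz input.
full_frame = buffer_latency_ms(525, 525, 59.94)  # reader waits for a complete frame
few_lines = buffer_latency_ms(2, 525, 59.94)     # reader trails by only 2 lines
print(f"{full_frame:.2f} ms vs {few_lines:.3f} ms")
```

The gap between the two figures is what a deterministically placed read pointer would be trying to recover.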

6t8k wrote:1080i deinterlacing test (standard-quality motion adaptive deinterlacer) using Gran Turismo 4 on the PS2: YouTube
I'm only getting a gray screen so far, which I cannot fix by changing any option in the Scaler opt. menu. Input preset is 1080i_60 and output mode is 1080p_50 while the framelock status LED group is lit. One time after the gray screen when going back to ALM mode and then to SCL mode again, the picture froze after a second or so (at 480i input). Resetting HPS got SCL mode back to working order.

On the latest image I also bumped VIP frequencies so there's a good chance 1080i deinterlacing is working now.

6t8k · Post by **6t8k** » Sat Apr 17, 2021 7:40 pm

Thanks for the update, I'll check it out soon.

I neglected to think about the implications bypassing the framebuffer would have on block RAM usage; the scaled frames of course still need to be projected onto the output timing and a clock domain crossing must take place. A custom frame buffer implementation as you wrote seems like a better solution (it's not really clear to me what you meant by 'separate VIP reader/writer'). It's unfortunate that the pixel FIFO in the CVO core can't be configured to use external RAM instead it seems.

marqs wrote:Perhaps the default motion shift and scale values should be tweaked for that to make a difference. I could add respective options on menu with a flag that enables visualization of motion vectors.

I was about to try that, just didn't find the time yet - probably this weekend.

6t8k · Post by **6t8k** » Sun Apr 18, 2021 9:38 pm

With the increased VIP working frequency, 1080i deinterlacing works beautifully now: YouTube
The 60Hz input is still converted to 50Hz output despite framelock being on in the menu and the framelock status LED group being lit (using 1920x1080 50-120Hz).

I discovered that when changing the audio sampling format option while the input signal is at 1080i, regardless of the mode of operation currently in effect, the HDMI signal seems to collapse ('no signal' with both monitor and capture card), and only seems to reappear once the input signal changes to a slower timing like 480i. This happens with the Apr 06 version already, I didn't check earlier ones. It could be a vd_isl issue.

Testing the different SCL mode output resolutions again with Fantasy Zone II DX 480i mode on the PS2, all seem to work properly now except 1920x1200 with framelock on and 1920x1200 with framelock off (50Hz) (results for the 1920x1200 line in the table above are unchanged). The output resolutions in the menu now always specify either a refresh rate or a refresh rate range, clarifying individual framelock support in principle. These are currently:

720x480 (60Hz)
720x576 (50Hz)
1280x720 (50-120Hz)
1280x1024 (60Hz)
1920x1080 (50-120Hz)
1600x1200 (60Hz)
1920x1200 (50-60Hz)
1920x1440 (50-60Hz)
2560x1440 (50-60Hz)

I've added menu options for the 'Visualize Motion Values' and 'Motion Shift' registers, the 'Motion Scale' register is only available via the 'set B' registers, which are in place only if the Deinterlacer II core is upgraded via its compile-time configuration to support cadence detection with video-over-film and HQ motion adaptive deinterlacing. So before doing that and adapting the SW implementation to accommodate register set B, I'll first examine what improvements could be achieved just by tuning the 'Motion Shift' register for both standard and HQ motion adaptive deinterlacing.

marqs · Post by **marqs** » Mon Apr 19, 2021 7:46 pm

6t8k wrote:The 60Hz input is still converted to 50Hz output despite framelock being on in the menu and the framelock status LED group being lit (using 1920x1080 50-120Hz).

I discovered that when changing the audio sampling format option while the input signal is at 1080i, regardless of the mode of operation currently in effect, the HDMI signal seems to collapse ('no signal' with both monitor and capture card), and only seems to reappear once the input signal changes to a slower timing like 480i. This happens with the Apr 06 version already, I didn't check earlier ones. It could be a vd_isl issue.

The framelock config selection issue is now fixed. I'll try to see if I can reproduce the audio sampling format change problem.

6t8k wrote:Testing the different SCL mode output resolutions again with Fantasy Zone II DX 480i mode on the PS2, all seem to work properly now except 1920x1200 with framelock on and 1920x1200 with framelock off (50Hz) (results for the 1920x1200 line in the table above are unchanged). The output resolutions in the menu now always specify either a refresh rate or a refresh rate range, clarifying individual framelock support in principle.

When falling back from framelock, the system doesn't currently select the closest vertical frequency for free-running mode but settles on the first valid preset starting from 50Hz. That probably explains your issue with 1920x1200 with 480i input, i.e. it falls back to 1920x1200_50. Not sure why your monitor doesn't support that mode, though, but it might be just a compatibility issue. The listed refresh rate ranges actually apply to both framelocked and freerunning modes - they just give an indication to the user of which refresh rates are feasible with different input/framelock parameters. For example, selecting "Off (50Hz)" for framelock with 720x480 (60Hz) output mode does not make sense and results in "Off (60Hz)" getting applied under the hood.
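The difference between the current fallback and a nearest-rate fallback can be sketched as follows. The function name and the list-of-rates representation are hypothetical and only mirror the behavior described above, not the firmware's actual preset tables.

```python
def pick_free_run_rate(valid_rates_hz, input_hz, nearest=False):
    """Choose the free-running output rate once framelock is unavailable.

    nearest=False mimics the behavior described above: take the first valid
    preset when scanning upward from 50Hz. nearest=True picks the preset
    closest to the input's vertical frequency instead.
    """
    rates = sorted(valid_rates_hz)
    if nearest:
        return min(rates, key=lambda r: abs(r - input_hz))
    return rates[0]


# 1920x1200 (50-60Hz) with a ~60Hz 480i source.
print(pick_free_run_rate([50, 60], 59.94))                # current behavior
print(pick_free_run_rate([50, 60], 59.94, nearest=True))  # nearest-rate alternative
```

With only one valid rate, as for 720x480 (60Hz), both strategies return the same value, which matches the "Off (60Hz)" override mentioned above.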

6t8k · Post by **6t8k** » Thu Apr 22, 2021 9:26 pm

All indications are that setting the deinterlacing algorithm parameter of the Deinterlacer II core in its compile-time config to 'Motion Adaptive High Quality', based on the current configuration, without anything else (if register set A is used), yields the same results as the standard 'Motion Adaptive' configuration. I cannot rule out that the cadence detection is different as I did not test it (at any rate I'd have expected motion/edge detection results to be different). I determined this by frame-stepping through lossless recordings of each configuration's output, each using various motion shift values and various games, both with motion visualization on and off.

Below are two exemplary pairs of lossless screenshots, each showing the same frame out of a fixed sequence, comparing both configurations while motion visualization is on. The scaling algorithm is set to nearest neighbor everywhere to reduce the scaling's impact on the deinterlacer's result:

Maximo demo sequence, motion shift is set to 2:
Motion Adaptive
Motion Adaptive High Quality

SNK vs. Capcom: SVC Chaos intro movie, motion shift is set to 3:
Motion Adaptive
Motion Adaptive High Quality

Motion shift can be set to an integer value within the range 0..7, with lower values causing more motion to be detected in more parts of the image - the more motion, the more bob instead of weave is applied. According to the user guide, the value should be chosen such that no motion is detected in static parts of the image. The default value the core comes with is 3. Ideally, in theory, we'd like to use a value that is maximally sensitive to moving parts of the image while not detecting any motion in static parts of the image.

With 0, there is always significant motion detected across the whole image, regardless of whether it (or parts of it) are moving or not, and no matter which scene: [screenshot]
With 1, albeit less pronounced, the noise floor is still there: [screenshot]
With 2, the noise floor vanishes and no motion is detected in static parts of the image anymore: [screenshot]
Higher values then (further) decrease the sensitivity to moving parts of the image, which increases the tendency for combing (weave) artifacts to appear.
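The effect of the motion shift value can be modelled with a toy per-pixel blend. This is only a sketch: it assumes the register right-shifts a raw motion value before it steers the mix between weave and bob, and the function, its parameters and the 8-bit clamp are illustrative rather than a description of the core's internals.

```python
def motion_adaptive_pixel(weave_px, bob_px, raw_motion, motion_shift, max_motion=255):
    """Blend weave and bob results according to shifted motion (toy model)."""
    if not 0 <= motion_shift <= 7:
        raise ValueError("motion shift must be within 0..7")
    # A larger shift discards more low-order bits, so small (noisy) motion reads as zero.
    motion = min(raw_motion >> motion_shift, max_motion)
    return (bob_px * motion + weave_px * (max_motion - motion)) / max_motion


# A small noise-level motion value on a static pixel:
print(motion_adaptive_pixel(100, 50, raw_motion=3, motion_shift=0))  # some bob leaks in
print(motion_adaptive_pixel(100, 50, raw_motion=3, motion_shift=2))  # pure weave
```

In this model the shift acts as a noise gate: too low and static areas get bobbed, too high and real motion is weaved, producing combing.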

Here are a few lossless screenshots of Maximo's main menu while in motion, shot at the same position, but with different motion shift values: set to 2 | set to 3 | set to 4
Similarly for OutRun 2006, although speed/position are not exactly the same: set to 2 | set to 3

Given these results, setting the default motion shift value to 2 looks like the best choice to me (regarding register set A), but if there are other opinions I'd love to hear them as well, and as always: if anybody would like to see something specific tested, let me know and I'll see what I can do if it seems worthwhile. I've provided some more videos here which could be used as further reference.

I also have the HQ configuration with register set B running, more on that soon.

---

marqs wrote:The listed refresh rate ranges actually apply to both framelocked and freerunning modes - they just give an indication to user on which refresh rates are feasible with different input/framelock parameters.

So the resolutions with concrete refresh rates given next to them don't support framelock and are always free-running at the respective rate (regardless of the framelock menu option). The resolutions with ranges next to them support framelock in principle within the given range, but it might not (currently) be implemented for the specific input/output combination used (or for another target rate, like 2x Hz). They also support non-framelock operation in principle within the given range, but it might not (currently) be implemented for another target rate such as nearest Hz or 100Hz. Did I get the general idea right?

I missed that 1920x1200_50 doesn't work on the monitor with 240p input either *shakes head in disbelief*, so the monitor must have some compatibility issue with 1920x1200_50.

marqs wrote:
6t8k wrote:The 60Hz input is still converted to 50Hz output despite framelock being on in the menu and the framelock status LED group being lit (using 1920x1080 50-120Hz).
The framelock config selection issue is now fixed.

Can confirm that this works now.

Harrumph · Post by **Harrumph** » Sat Apr 24, 2021 10:09 pm

Just wanted to give a shoutout to 6t8k, good job on your efforts and thorough documentation, I’m sure it’s a big help to Marqs.

marqs · Post by **marqs** » Sun Apr 25, 2021 9:23 pm

6t8k wrote:So the resolutions with concrete refresh rates given next to them don't support framelock and are always free-running at the respective rate (regardless of the framelock menu option). The resolutions with ranges next to them support framelock in principle within the given range, but it might not (currently) be implemented for the specific input/output combination used (or for another target rate, like 2x Hz). They also support non-framelock operation in principle within the given range, but it might not (currently) be implemented for another target rate such as nearest Hz or 100Hz. Did I get the general idea right?

The resolutions with a single refresh rate displayed also support framelock on that rate within some margin. Framelock presets (which have been streamlined in latest fw) are inherently PLL configurations which define parameters for converting input sampling clock to output pixel clock. Ideally the configurations would be generated run-time, but Silabs doesn't provide a C API for the clock generator parametrization so they are currently generated offline and turned into presets. One preset essentially maps a sampling configuration (clk, h_total, v_total being the relevant parameters) into an output mode (timings except pixel clock being fully standard-compliant), and it is functional for the expected refresh rate (as defined during generation) with maybe +-25% margin.
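
As a rough illustration of what such a preset encodes, the sketch below computes the exact clock ratio needed so that one output frame lasts as long as one input frame. The function names are hypothetical, and the example numbers (13.5 MHz sampling clock, 858x262 input totals) are assumptions chosen for illustration, not values from the firmware. The 2200x1125 totals are the standard 1080p60 totals.

```python
from fractions import Fraction


def input_refresh_hz(sample_clk_hz: int, h_total: int, v_total: int) -> Fraction:
    """Refresh rate implied by a sampling configuration."""
    return Fraction(sample_clk_hz, h_total * v_total)


def framelock_ratio(in_h_total: int, in_v_total: int,
                    out_h_total: int, out_v_total: int) -> Fraction:
    """Output pixel clock divided by input sampling clock for equal frame periods."""
    return Fraction(out_h_total * out_v_total, in_h_total * in_v_total)


def within_margin(actual_hz: float, expected_hz: float, margin: float = 0.25) -> bool:
    """Check the rough +-25% usable range mentioned for a preset."""
    return abs(actual_hz - expected_hz) <= margin * expected_hz


ratio = framelock_ratio(858, 262, 2200, 1125)
print(ratio)                          # 18750/1703
print(float(13_500_000 * ratio))      # required output pixel clock in Hz
```

The ratio is an exact fraction, which is why the PLL has to be configured in fractional mode to hit it without drift.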

In the latest firmware non-framelock 100Hz and 120Hz options have been added which are usable for 720p and 1080p although the latter may not work on DE10-Nano due to hardware limitations. There is now also a "2x Hz" option for framelock, but I've only added respective PLL configs for 480i generic mode.

6t8k · Post by **6t8k** » Thu Apr 29, 2021 7:37 pm

With the HQ motion adaptive deinterlacing configuration using register set B (HQMAD/B), FPGA resource utilization on the DE10-Nano is now at 66% of total logic space, 67% of total memory bits, 100% of total RAM blocks and 100% of total DSP blocks according to the fitter summary. To make the changed configuration work, I only had to move the address base of the associated run-time control interface, as the doubled address space size would otherwise have overlapped with others.

Motion shift works just like with register set A, but the noise floor only vanishes once the value is set to 3 instead of 2: set to 0 | set to 1 | set to 2 | set to 3
The value is therefore always set to 3 in the following screenshots.

Per the user guide, the core multiplies the detected amount of motion by the value in the motion scale register according to the following formula: scaled motion = detected motion * (motion scale / 32). The bob/weave decision is then based on the scaled motion quantity. Motion scale is an integer within the range 0..255, the default being 125. To tune the value, it should be gradually decreased until weave artifacts can be seen (if none are visible to begin with), and then gradually increased until all weave artifacts disappear. With all games I tested so far, (minor) weave artifacts can already be seen with the default value of 125. Increasing the value from 130 to 131, there is a sudden change in behavior (set to 130 | set to 131); all values above 130 seem to give rise to more weave artifacts than 125, 131 much more so, 255 just a little more. In theory, weave artifacts could be reduced by setting motion scale to 130 instead of 125, but in practice I've found the difference to be negligible:
Set to 125: Maximo | OutRun 2006
Set to 130: Maximo | OutRun 2006 | Maximo, standard-quality MA deinterlacing, motion shift set to 2
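
For reference, the motion scale formula quoted from the user guide can be written out as a small sketch. The function name is hypothetical, and the code only reproduces the formula as stated above. It does not model the sudden change observed between 130 and 131.

```python
def scaled_motion(detected_motion: float, motion_scale: int) -> float:
    """scaled motion = detected motion * (motion scale / 32)"""
    if not 0 <= motion_scale <= 255:
        raise ValueError("motion scale must be within 0..255")
    return detected_motion * (motion_scale / 32)


# Effective gain applied to the detected motion for a few register values.
for value in (0, 32, 125, 130, 255):
    print(value, scaled_motion(1, value))
# 0 0.0
# 32 1.0
# 125 3.90625
# 130 4.0625
# 255 7.96875
```

Going from 125 to 130 raises the gain by only 4%, which fits the observation that the difference is hard to see.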

I can still see weave artifacts in full motion when motion scale is set to 130, but making all of them disappear would probably be too high of an expectation anyway. All in all, testing with Maximo, OutRun 2006, SNK vs. Capcom: SVC Chaos, Katamari Damacy and Dragon Quest VIII (the latter two are frame-rendered), my impression is that the defaults are a good choice already.

Inspecting Maximo's demo sequence again while visualizing motion - motion shift and -scale at their defaults - there are now differences which go beyond just analog noise (it's notable here that the fog near the ground seems to be randomized by the game), though it's probable that the sobel filter's impact is only visible in the end result and not through motion visualization. It mostly comes down to the core also looking 1 field forward now, which becomes most apparent considering e.g. the sword or the coins:
[1-HQ] [1-SQ] | [2-HQ] [2-SQ] | [3-HQ] [3-SQ]

Comparing OutRun 2006 using HQMAD/A, nearest neighbor scaling and motion shift set to 2 (video) side-by-side with HQMAD/B with motion shift set to 3 and motion scale set to 130 (video), I can't really spot any improvements to speak of with the latter. Any differences would be attenuated further by any scaling filter that's smoother than nearest neighbor. To that end, my original OutRun 2006 recording using the standard-quality MA deinterlacing with Lanczos3 scaling and motion shift set to 3 (video), which should even increase the difference, could be compared to a HQMAD/B recording with motion shift at 3 and motion scale at 130 that also uses Lanczos3 (video). This leads me to the same basic verdict.

So far I only tuned the motion shift and motion scale registers, although register 12, which enables the cadence detection and video over film features and accompanying registers, has to be set to 1 for changes to the motion shift register to take effect. In other words, I didn't really look into the cadence detection and video over film features and related registers yet as they shouldn't be needed for games - perhaps most benefits of the HQ motion adaptive deinterlacing config unfold when using those features. Based on polling register 3 'cadence detected', with the games I tested so far, no cadence was ever detected regardless of register 12's value. One drawback of using register set B is that there seems to be no easy way to facilitate pure bob or pure weave deinterlacing. Pure weave could maybe be achieved by setting motion scale to 0, but I currently see no way to achieve pure bob.

Assuming that the selection of games I've tested so far stands for fairly representative content, in summary, it doesn't look like the increase in FPGA resource usage, added latency and pure bob / weave limitation would be worth the benefits with respect to games.

---

Harrumph wrote:Just wanted to give a shoutout to 6t8k, good job on your efforts and thorough documentation, I’m sure it’s a big help to Marqs.

Thank you for the kind words ^-^

marqs wrote:The resolutions with a single refresh rate displayed also support framelock on that rate within some margin. Framelock presets (which have been streamlined in latest fw) are inherently PLL configurations which define parameters for converting input sampling clock to output pixel clock. Ideally the configurations would be generated run-time, but Silabs doesn't provide a C API for the clock generator parametrization so they are currently generated offline and turned into presets. One preset essentially maps a sampling configuration (clk, h_total, v_total being the relevant parameters) into an output mode (timings except pixel clock being fully standard-compliant), and it is functional for the expected refresh rate (as defined during generation) with maybe +-25% margin.

Interesting, thank you for the clarifications! Referencing AN619 it should be possible to generate the required register configurations during run-time. The question would be whether the effort and increased code footprint would be worth it I suppose, I currently don't have a good overview over how many (more) framelock presets would be needed to cover all expected input/output combinations. But doing so should make matching the output refresh rate to the input refresh rate in unlocked mode as closely as possible more practical (for optimal results regarding seamless mode switch or ironing out issues in the source sync signal), and it'd be nice to eventually get rid of the ClockBuilder dependency which I assume you're using.
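
As a sketch of what run-time generation would involve at the register level, the snippet below encodes a fractional feedback divider a + b/c into the P1/P2/P3 values, following the formulas as I understand them from AN619. The function name and the range checks are my own, and the exact limits should be verified against the application note before relying on them. Choosing a, b and c for a given framelock ratio is the harder part and is not covered here.

```python
def multisynth_params(a: int, b: int, c: int) -> tuple[int, int, int]:
    """Encode divider a + b/c into (P1, P2, P3) per the AN619 formulas (to be verified)."""
    if not 1 <= c <= 1_048_575:
        raise ValueError("c out of range")
    if not 0 <= b < c:
        raise ValueError("b must satisfy 0 <= b < c in this sketch")
    if not (15 <= a <= 90 and (a < 90 or b == 0)):
        raise ValueError("feedback divider assumed to be within 15..90")
    q = (128 * b) // c            # floor(128 * b / c)
    p1 = 128 * a + q - 512
    p2 = 128 * b - c * q
    p3 = c
    return p1, p2, p3


print(multisynth_params(36, 0, 1))    # (4096, 0, 1)
print(multisynth_params(28, 7, 25))   # (3107, 21, 25)
```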

marqs wrote:In the latest firmware non-framelock 100Hz and 120Hz options have been added which are usable for 720p and 1080p although the latter may not work on DE10-Nano due to hardware limitations. There is now also a "2x Hz" option for framelock, but I've only added respective PLL configs for 480i generic mode.

In a quick test, these and the 'Off (closest Hz)' option that you've also added as previously announced seem to work well, except as expected 1080p@100Hz and 120Hz.

Edit, a few examples for the 'Off (closest Hz)' framelock option: it currently elicits 50Hz unlocked output when the '1920x1080 (50-120Hz)' output resolution is selected during the PAL PS2 system menu which according to DE10-vd_isl comes in at 49.99Hz, and likewise SNK vs. Capcom: SVC Chaos's refresh rate selection screen coming in at 49.76Hz. It elicits 60Hz unlocked with '1920x1080 (50-120Hz)' after choosing the 60Hz option in SVC Chaos which makes the console output fields at 59.93Hz. Selecting e.g. '1280x1024 (60Hz)' in these cases, 60Hz unlocked gets applied throughout while selecting '720x480 (60Hz)' leads to 59.94Hz unlocked (1000/1001 factor due to the NTSC origin) being applied throughout. On that note, the info/stats board now also detailing the current input/output refresh rates (the latter wasn't exposed at all before) is also appreciated.
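
The observed behavior is consistent with picking the nearest nominal rate from the set the selected output mode offers. The sketch below is only a guess at that logic: the function name and the candidate list are hypothetical and not taken from the firmware. It also shows the 1000/1001 arithmetic behind the 59.94Hz figure.

```python
def closest_rate(input_hz: float, candidates: tuple[float, ...]) -> float:
    """Pick the candidate output rate nearest to the measured input rate."""
    return min(candidates, key=lambda rate: abs(rate - input_hz))


# Hypothetical candidate set for the '1920x1080 (50-120Hz)' mode.
rates = (50.0, 60.0, 100.0, 120.0)
print(closest_rate(49.99, rates))  # 50.0
print(closest_rate(49.76, rates))  # 50.0
print(closest_rate(59.93, rates))  # 60.0

# NTSC-derived modes run at 60Hz scaled by 1000/1001.
print(round(60 * 1000 / 1001, 2))  # 59.94
```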

marqs · Post by **marqs** » Mon May 10, 2021 11:14 pm

6t8k wrote:Assuming that the selection of games I've tested so far stands for fairly representative content, in summary, it doesn't look like the increase in FPGA resource usage, added latency and pure bob / weave limitation would be worth the benefits with respect to games.

That is indeed the case.

6t8k wrote:Interesting, thank you for the clarifications! Referencing AN619 it should be possible to generate the required register configurations during run-time. The question would be whether the effort and increased code footprint would be worth it I suppose, I currently don't have a good overview over how many (more) framelock presets would be needed to cover all expected input/output combinations. But doing so should make matching the output refresh rate to the input refresh rate in unlocked mode as closely as possible more practical (for optimal results regarding seamless mode switch or ironing out issues in the source sync signal), and it'd be nice to eventually get rid of the ClockBuilder dependency which I assume you're using.

It is possible to generate fractional PLL configurations at runtime, as is already done for integer-mode PLL configurations, but the former is more complicated due to prime factorization etc. Increased code footprint is a non-issue as .text and .rodata can easily be moved into flash, or everything can even be executed from DRAM if necessary. For now, making presets via ClockBuilder is still somewhat tolerable, but in the long run I'd indeed prefer getting rid of it.

The latest image enables edge-adaptive configuration for scaler and allows setting filter coefficients at runtime. Filter tap length is reduced to 4 to work around temporary block RAM limitations while also making it possible to easily try various 4-tap 16-phase filters found here.
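
For anyone wanting to experiment, here is a minimal sketch that generates a 4-tap 16-phase coefficient table using Catmull-Rom weights. The choice of kernel and the function name are mine, and the coefficients are left as floating-point values because the number format the scaler expects is not stated here and would need to be checked.

```python
def catmull_rom_table(phases: int = 16) -> list[list[float]]:
    """Return one row of 4 tap weights per phase; each row sums to 1."""
    table = []
    for p in range(phases):
        t = p / phases  # fractional position between the two center taps
        w0 = (-t**3 + 2 * t**2 - t) / 2
        w1 = (3 * t**3 - 5 * t**2 + 2) / 2
        w2 = (-3 * t**3 + 4 * t**2 + t) / 2
        w3 = (t**3 - t**2) / 2
        table.append([w0, w1, w2, w3])
    return table


table = catmull_rom_table()
print(table[8])  # [-0.0625, 0.5625, 0.5625, -0.0625]
```

Phase 0 reduces to the weights 0, 1, 0, 0, i.e. the source pixel is passed through unchanged when the output sample lands exactly on it.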

marqs · Post by **marqs** » Sun Oct 10, 2021 8:30 pm

An updated firmware for DE10-Nano is on github that has latest changes integrated from OSSC Pro. All functionality including scaler mode now works directly from the SD card image.

There are also plans to manufacture a bunch of these add-on boards since people have been asking for them. After all, it's a low-risk, low-cost product which should be useful for anyone who owns one of the compatible development boards.

Insert Disk Two · Post by **Insert Disk Two** » Fri Jan 14, 2022 5:48 pm

Does this support:

- S-Video in
- 240p x5 on 1080p (cropped)

Thanks!

bonzo.bits · Post by **bonzo.bits** » Fri Jun 03, 2022 2:46 am

Looking for a downscaling solution so I can play PC games on a 15 kHz CRT. Does this board have the 480p > 240p downscaling implemented?

If so, were any boards produced that can be purchased, or do I go the DIY route using the resources on github? If the latter, I am familiar with ordering from a board producer but need to know which files I provide to them. Thanks!

marqs · Post by **marqs** » Fri Jun 03, 2022 8:33 am

bonzo.bits wrote:Looking for a downscaling solution so I can play PC games on a 15 kHz CRT. Does this board have the 480p > 240p downscaling implemented?

480p -> 240p is currently implemented as "line drop" in line multiplier mode. More generic downscaling (i.e. any->any) is available in scaler mode but currently its lowest output mode is 480p. It shouldn't be a large task to add 240p & 288p there too, though.
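
Conceptually, "line drop" just means discarding lines rather than filtering them. A minimal sketch, assuming every second line is kept (which line of each pair the firmware actually keeps is not stated here, and the function name is hypothetical):

```python
def line_drop(frame: list, factor: int = 2) -> list:
    """Keep every `factor`-th line of a frame given as a list of scanlines."""
    return frame[::factor]


frame_480p = [f"line {n}" for n in range(480)]
frame_240p = line_drop(frame_480p)
print(len(frame_240p))   # 240
print(frame_240p[:3])    # ['line 0', 'line 2', 'line 4']
```

Since no filtering is involved, fine horizontal detail that only exists on the dropped lines is lost, which is the trade-off compared to scaler mode.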

bonzo.bits wrote:If so, were any boards produced that can be purchased, or do I go the DIY route using the resources on github? If the latter, I am familiar with ordering from a board producer but need to know which files I provide to them. Thanks!

The first batch of preassembled boards should be available in 1 month. In the current market situation it'd be hard to go the DIY route since so many things are out of stock. Speaking of DIY, the latest PCB revision (v1.4) has a few modifications that improve signal integrity compared to older revisions, and the pinout has been slightly changed. Those who have built v1.3 boards should apply the same improvements, which are also required for them to be compatible with current firmware - I'll need to add instructions on the github page.

marqs · Post by **marqs** » Fri Jun 03, 2022 3:33 pm

Insert Disk Two wrote:Does this support:

- S-Video in
- 240p x5 on 1080p (cropped)

S-Video is not currently supported. 240p can be line-multiplied 4x or 5x into 1080p output or freely scaled using polyphase scaler.
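
The arithmetic behind the two line-multiplier options can be sketched as follows. The function name and the returned fields are hypothetical, and how the firmware positions or crops the image is not covered.

```python
def line_multiply_fit(src_lines: int, factor: int, out_lines: int = 1080) -> dict:
    """Work out how many lines end up as border or get cropped for an integer factor."""
    scaled = src_lines * factor
    if scaled <= out_lines:
        return {"scaled_lines": scaled,
                "border_lines": out_lines - scaled,
                "cropped_source_lines": 0}
    visible_src = out_lines // factor  # source lines that still fit on screen
    return {"scaled_lines": scaled,
            "border_lines": out_lines - visible_src * factor,
            "cropped_source_lines": src_lines - visible_src}


print(line_multiply_fit(240, 4))
# {'scaled_lines': 960, 'border_lines': 120, 'cropped_source_lines': 0}
print(line_multiply_fit(240, 5))
# {'scaled_lines': 1200, 'border_lines': 0, 'cropped_source_lines': 24}
```

So 4x leaves 120 unused output lines, while 5x fills the screen at the cost of 24 source lines.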

rthorntn · Post by **rthorntn** » Fri Jun 10, 2022 6:42 am

Hi Marqs,

Am I able to reserve a board from the first batch of preassembled boards?

Thanks.
richard

Nrg · Post by **Nrg** » Fri Jun 10, 2022 6:15 pm

I'm interested in 1x preassembled board as well!

Dexje · Post by **Dexje** » Mon Jun 13, 2022 3:45 pm

great project

would it be possible to connect a vga source with a vga to scart cable as the addon board only has a scart connector?
