1. Horizontal sync pulses are not periodic, and frame time is not constant either
2. When the first visible scanline (#15) is preceded by a short scanline, it is offset left by one dot, since the clipping of the horizontal sync period is postponed to the end of the visible scanline. When short scanlines occur every other frame, pixels on the first visible scanline jitter.
In the days of CRTs, these didn't really cause any problems. Slight sync jitter didn't disturb scanning, and the first "visible" scanlines were masked by overscan, hiding any inconsistencies. With modern digital systems, however, things are a bit different. PLLs in video digitizers use horizontal sync as their reference and assume it to be perfectly periodic. If it is not, the sampling clock starts to jitter (it typically stabilizes after a few equal-length scanlines), and in the worst case the PLL may lose its lock. The problem tends to get amplified in devices like the OSSC, which generate their output video clock directly from the output of the sampling PLL: even if that PLL manages to keep its lock, the resulting clock will be jittery, and other PLLs/clock circuitry in the chain (e.g. in an HDMI receiver) may not be able to cope with it. Even though 1/341 of a line does not sound like much, it is enough to upset many TVs and video capture cards (PC displays generally seem to be more tolerant).
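To put the 1/341 figure into absolute terms, here is a small back-of-the-envelope calculation. It assumes the NTSC master clock of 236.25 MHz / 11 (roughly 21.477 MHz), which is not stated above; the variable names are just for illustration.

```python
# Assumption: NTSC NES/Famicom master clock (236.25 MHz / 11), not given in the post.
MCLK_HZ = 236.25e6 / 11
MCLK_PER_DOT = 4      # 1364 MCLK cycles / 341 dots = 4 cycles per dot
DOTS_PER_LINE = 341   # dots in a normal scanline

normal_line_cycles = DOTS_PER_LINE * MCLK_PER_DOT        # 1364
short_line_cycles = (DOTS_PER_LINE - 1) * MCLK_PER_DOT   # 1360

# Duration of a normal scanline in microseconds
normal_line_us = normal_line_cycles / MCLK_HZ * 1e6
# Time by which a short scanline's sync pulse arrives early, in nanoseconds
skipped_ns = (normal_line_cycles - short_line_cycles) / MCLK_HZ * 1e9

print(f"normal scanline: {normal_line_cycles} cycles, {normal_line_us:.3f} us")
print(f"short scanline:  {short_line_cycles} cycles")
print(f"sync arrives early by {skipped_ns:.1f} ns "
      f"({100 / DOTS_PER_LINE:.2f}% of a line)")
```

Under that clock assumption, the short scanline shifts the sync pulse by about 186 ns, or roughly 0.29% of the line period.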
For some time I've had an idea for a small modification that compensates for the skipped tick in the time domain by gating the system clock for an equal amount of time (4 cycles). Since Nintendo's CPU/PPU should contain only static logic, gating the clock should be OK. I finally got around to trying the idea last week on a Famicom + NESRGB combo. The MCLK and CSYNC signals were routed to an external FPGA which monitors the timing and outputs a gated clock, "GCLK", and a jitter-free CSYNC. That seemed to do the trick, even though I only forwarded GCLK to the PPU and NESRGB, with the CPU using the original clock for simplicity. In theory, the CPU may try to access the PPU during those gated cycles (resulting in an invalid read/write), so a more robust solution would also route GCLK to the CPU. Ideally the logic (which is not much) would also be integrated into the CPLD on the NESRGB board, so there would be no need for extra HW aside from a couple of resistors and wires.
The mod solves problem #1, but #2 still remains. Fixing the sync is probably enough for most people (who are fine with ignoring / masking the first visible scanline), but #2 could still be fixable on the NESRGB via a line buffer or heuristics.
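The gating idea can be sketched as a simple behavioral model. This is not the actual FPGA logic, only an illustration of the arithmetic: the function name `dejitter` and its return format are made up here, and it simply assumes that every gated cycle stalls the PPU for one MCLK period, so gated cycles add to the wall-clock length of the line.

```python
NORMAL_LINE = 1364  # MCLK cycles in a normal scanline

def dejitter(line_lengths, nominal=NORMAL_LINE):
    """Return a (gated_cycles, output_period) pair for each scanline.

    line_lengths: MCLK cycles the PPU needs for each scanline.
    Hypothetical model: gating can only stretch a line, never shorten it.
    """
    result = []
    for length in line_lengths:
        gated = max(0, nominal - length)         # cycles to hold the clock
        result.append((gated, length + gated))   # wall-clock period of the line
    return result

# A normal line, a short line, and a long line (1368 cycles)
lines = [1364, 1360, 1368]
for length, (gated, period) in zip(lines, dejitter(lines)):
    print(f"{length} cycles -> gate {gated}, sync period {period}")
```

In this model the 1360-cycle line is stretched back to 1364 cycles, while a 1368-cycle line passes through unchanged, since holding the clock cannot make a line shorter.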
Below are timing diagrams illustrating the operation:
Normal scanline, 1364 MCLK cycles:

Short scanline, 1360 MCLK cycles. This activates the nes_dejitter logic, which gates the clock for 4 cycles, normalizing the horizontal sync period:

Sample videos (nes_dejitter initially off, enabled after 3-5 seconds):
Grid
Timing info (if the CPU were gated as well, there should be no difference, as the mod would then be completely transparent to the system)
Contra (first row demonstrates problem #2)
I'll try the same mod on the SNES next. Its PPU is similar, so I see no reason why it would not work. The short scanline actually occurs deeper in vblank, so problem #2 does not even exist on it. However, its 576i mode apparently uses long (1368-cycle) scanlines, which cannot be dejittered with this mod, leaving the few people who play PAL interlaced games with jittery sync.
Update 05/2018: Design files for an add-on board and installation instructions for several models are available on GitHub. Pre-assembled boards are expected to be available on Videogameperfection.com sometime in June.
Update 07/2018: Pre-assembled boards are now available on Videogameperfection.com, both as standalone and with installation service.