Real-time Video Scaling
Scaling a real-time video stream imposes more constraints than off-line scaling of still images. The first constraint is that each frame must be processed within a fixed amount of time. The second is the handling of interlaced fields. Because of these speed requirements, and because video scaling by definition involves no rotation, computationally intensive spline and sinc interpolation is not needed.

At the other end of the processing spectrum is a straight merge of the even and odd lines of two consecutive video fields. This is usually not the preferred method, because any motion during the roughly 1/60th of a second between the capture of the two fields creates a "feathering" effect on the de-interlaced output. See Figure 1.

A well-established middle-ground method for video scaling is bilinear interpolation. Bilinear interpolation is the process of using each of the intermediate fields in an interlaced video frame to generate a full-size target image. Either all the odd or all the even lines of the field are used; interpolations are then performed between the lines and between adjoining pixels to generate an entire non-interlaced frame for the progressive-scan output.

With video scaling, the desired amount of stretch in the X and Y directions is rarely a simple power of two (see Figure 2), so a quick halfway interpolation between source pixels is not feasible. Instead, a weighted-coefficient method is typically used: each target pixel is the linearly interpolated value of the adjacent source pixels, weighted by how close each is spatially to the target position. See Figure 3. For a moderate cost in hardware, weighted bilinear interpolation delivers good image quality at arbitrary scale ratios.
Sensoray's model 2246 HDTV Frame Grabber and Display processes multiple video inputs at many input sizes. By using several identical instances of a generic bilinear interpolator, the 2246 can scale any input to any of the output targets, plus an MPEG compression target for streaming overlaid video back to the host computer for PVR functions. To give an idea of size, the scaling engine used in the 2246 occupies approximately 15% of Altera's second-smallest Stratix II FPGA (the EP2S30). The 2246 supports a wide range of input-to-output scale settings.
Figure 1: "Feathering" on the de-interlaced output when merged fields contain motion.

Figure 2: The stretch in X and Y is rarely a simple power of two.
Bilinear Interpolation Example
As an example of weighted scaling by a non-integer ratio, let's scale 5 lines to 8 lines. The step size is (SourceLines - 1) / (TargetLines - 1) = 4/7.
Equations

F = 1*A
G = (1 - 4/7)*A + (4/7)*B
H = (1 - 1/7)*B + (1/7)*C
I = (1 - 5/7)*B + (5/7)*C
J = (1 - 2/7)*C + (2/7)*D
K = (1 - 6/7)*C + (6/7)*D
L = (1 - 3/7)*D + (3/7)*E
M = 0*D + (7/7)*E

Repeat for the X direction.

Code

The following pseudocode performs arbitrary bilinear interpolation. It also shows how the floating-point arithmetic can be handled with integer arithmetic only, using 16.16 fixed-point coefficients. HActiveStart and HActiveStop delimit the active portion of the output line.

X-Direction

    HSrcStepInt = SrcWidth / TgtWidth;
    HSrcStepRem = SrcWidth % TgtWidth;
    for (CurrentReadLine = 0; CurrentReadLine < SrcHeight; CurrentReadLine++) {
        CurrentWriteLine = CurrentReadLine;
        NumPixels = 0;  PixelCount = 0;        // reset per line
        HSrcPosInt = 0; HSrcPosRem = 0;
        HTotStepInt = 2;                       // pre-load first two pixels
        while (NumPixels < HActiveStop) {
            if (HTotStepInt != 0) {
                SourcePix1 = SourcePix2;
                SourcePix2 = SourceImage[CurrentReadLine][HSrcPosInt];
                HSrcPosInt++;
                HTotStepInt--;
            } else if (NumPixels < TgtWidth) {
                if (NumPixels >= HActiveStart && NumPixels < HActiveStop) {
                    coeff1 = (TgtWidth - HSrcPosRem) * 65536 / TgtWidth;
                    coeff2 = HSrcPosRem * 65536 / TgtWidth;
                    TargetLine[CurrentWriteLine][PixelCount] =
                        (coeff1 * SourcePix1 + coeff2 * SourcePix2) >> 16;
                    PixelCount++;
                }
                HSrcPosRem += HSrcStepRem;
                HTotStepInt = HSrcStepInt;
                if (HSrcPosRem >= TgtWidth) {  // carry the fractional step
                    HSrcPosRem -= TgtWidth;
                    HTotStepInt++;
                }
                NumPixels++;
                if (HTotStepInt != 0) {
                    SourcePix1 = SourcePix2;
                    SourcePix2 = SourceImage[CurrentReadLine][HSrcPosInt];
                    HSrcPosInt++;
                    HTotStepInt--;
                }
            }
        } // while NumPixels
    } // for CurrentReadLine

Y-Direction

    int VSrcStepInt = SrcHeight / TgtHeight;
    int VSrcStepRem = SrcHeight % TgtHeight;
    int VTotStepInt = 1;
    VSrcPosInt = 0; VSrcPosRem = 0; NumLines = 0;
    for (CurrentReadLine1 = 0; CurrentReadLine1 + 1 < SrcHeight; CurrentReadLine1++) {
        CurrentReadLine2 = CurrentReadLine1 + 1;
        if (VTotStepInt != 0) {
            if (VSrcPosInt < SrcHeight) {
                VSrcPosInt++;
            }
            VTotStepInt--;
        } else if (NumLines < TgtHeight) {
            VSrcPosRem += VSrcStepRem;
            VTotStepInt = VSrcStepInt;
            if (VSrcPosRem >= TgtHeight) {     // carry the fractional step
                VSrcPosRem -= TgtHeight;
                VTotStepInt++;
            }
            NumLines++;
        }
        for (NumPixels = 0; NumPixels < TgtWidth; NumPixels++) {
            TopPixel    = SourceImage[CurrentReadLine1][NumPixels];
            BottomPixel = SourceImage[CurrentReadLine2][NumPixels];
            coeff1 = (TgtHeight - VSrcPosRem) * 65536 / TgtHeight;
            coeff2 = VSrcPosRem * 65536 / TgtHeight;
            TargetPixel = (coeff1 * TopPixel + coeff2 * BottomPixel) >> 16;
        }
    } // for CurrentReadLine1

Conclusion

Real-time video scaling engines based on bilinear interpolation can be very modular, fast enough to cross-convert among the many video standards, and cost effective.