Hi,
When I investigated using the GPU for encoding, I found a lot of people mentioning that the speed advantage is only marginal. Some claimed 20 %, but when I tested it myself I only saw an advantage of 1-2 %.
I thought that logically and technically there must be something wrong, so I went and inspected my app's pipeline again. In most "regular" cases developers use the GPU for decoding/encoding/transcoding existing files, so the data sits in system memory, is moved to the GPU and then back again. The video files already exist and the frames can be consumed as fast as possible, so the encoding looks like:
sysmem(cpu)->vram(gpu)->sysmem(cpu)
In my case I have a "non-regular" GPU encoding at work. When I set my live encoding to, let's say, 25 fps, a frame becomes available for encoding only every 40 milliseconds. That alone eats into the GPU speed advantage already, but the more important problem is the whole delivery process to the GPU. My app's encoding looks like this:
vram(gpu)->sysmem(cpu)->sysmem(cpu)->vram(gpu)->sysmem(cpu)
The source in my encoding is a D3D surface (IDirect3DSurface9 or ID3D10/11Texture2D) and is copied to system memory so that the CPU- or GPU-implemented encoders can process it. The two sysmem steps in a row are because of the RGBA to BGRA color conversion: in sysmem the bytes get swapped and copied into a media buffer which is also in sysmem. And even though I am using SSE2 and AVX "non-temporal" store functions, this intermediate step of course slows down the whole encoding process by a good margin. Then the data gets moved to the GPU, and after encoding is done it gets moved back to system memory for storage.
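Just to illustrate what that intermediate sysmem step looks like, here is a minimal sketch of the kind of CPU swizzle I mean (not my actual code; it assumes 16-byte-aligned RGBA32 buffers whose size is a multiple of 16 bytes):

#include <emmintrin.h> // SSE2 intrinsics
#include <cstdint>
#include <cstddef>

// Swap R and B in each 32-bit RGBA pixel and write with non-temporal stores.
// Assumes src/dst are 16-byte aligned and byteCount is a multiple of 16.
void SwapRgbaToBgra(const uint8_t* src, uint8_t* dst, size_t byteCount)
{
    const __m128i maskR  = _mm_set1_epi32(0x000000FF); // R bytes
    const __m128i maskB  = _mm_set1_epi32(0x00FF0000); // B bytes
    const __m128i maskGA = _mm_set1_epi32(0xFF00FF00); // G and A stay put

    for (size_t i = 0; i < byteCount; i += 16)
    {
        __m128i px = _mm_load_si128(reinterpret_cast<const __m128i*>(src + i));
        __m128i r  = _mm_slli_epi32(_mm_and_si128(px, maskR), 16); // move R into B position
        __m128i b  = _mm_srli_epi32(_mm_and_si128(px, maskB), 16); // move B into R position
        __m128i ga = _mm_and_si128(px, maskGA);
        // Non-temporal store: bypass the cache, the data is only read again later.
        _mm_stream_si128(reinterpret_cast<__m128i*>(dst + i),
                         _mm_or_si128(ga, _mm_or_si128(r, b)));
    }
    _mm_sfence(); // make the streaming stores globally visible
}

Even with the streaming stores this is one full extra pass over every frame, which is exactly the cost I would like to get rid of.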
After I had inspected this, my mind went straight to the barricades: why all that senseless moving around when the data is already in vram and the GPU has full access to it? I could even use the GPU for the color conversion and would not have to move the data to system memory at all. The encoding should look like this:
vram(gpu)->sysmem(cpu)
I read about the "Hardware Handshake Sequence" and that one hardware MFT can connect its output to another hardware MFT's input by using MFT_CONNECTED_STREAM_ATTRIBUTE and MFT_CONNECTED_TO_HW_STREAM. As I understand it, it is not possible for hardware MFTs to consume data directly from device memory. The closest I could find was this thread, where it seems that it is possible with Intel Quick Sync. In my opinion it is technically possible with all GPU encoders, be it NVIDIA, AMD or Intel; the question is how much extra work outside of Media Foundation needs to be done. Or is it possible to rewrite the MFT implementations these vendors did and make the MFT consume from hardware memory?
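For reference, my reading of that handshake is that the topology loader wires the two stream attribute stores together roughly like this (only a sketch of the documented sequence, error handling shortened, nothing I have verified end to end):

#include <mftransform.h>
#include <mfapi.h>
#include <wrl/client.h>

// Rough sketch of the Hardware Handshake Sequence between an upstream
// hardware MFT and a downstream hardware MFT (e.g. the encoder).
HRESULT ConnectHardwareMfts(IMFTransform* upstream, DWORD upstreamOutputId,
                            IMFTransform* downstream, DWORD downstreamInputId)
{
    Microsoft::WRL::ComPtr<IMFAttributes> upstreamOut, downstreamIn;

    HRESULT hr = upstream->GetOutputStreamAttributes(upstreamOutputId, &upstreamOut);
    if (FAILED(hr)) return hr;
    hr = downstream->GetInputStreamAttributes(downstreamInputId, &downstreamIn);
    if (FAILED(hr)) return hr;

    // Point the upstream output stream at the downstream input stream ...
    hr = upstreamOut->SetUnknown(MFT_CONNECTED_STREAM_ATTRIBUTE, downstreamIn.Get());
    if (FAILED(hr)) return hr;

    // ... and mark both streams as connected to a hardware stream.
    hr = upstreamOut->SetUINT32(MFT_CONNECTED_TO_HW_STREAM, TRUE);
    if (FAILED(hr)) return hr;
    return downstreamIn->SetUINT32(MFT_CONNECTED_TO_HW_STREAM, TRUE);
}

But this only covers two hardware MFTs talking to each other; my source is not an MFT, it is a plain D3D surface in my own code, which is exactly where the documentation leaves me hanging.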
Another thought of mine is a slightly futuristic scenario, as it relates to "Unified Memory". Theoretically, on a platform that uses this model, like, let's say, hUMA from AMD, it should be possible to use MFCreateDXGISurfaceBuffer and send the sample to the hardware MFT. Internally it would only hand over the pointer, and the MFT could consume it directly without the data having to be moved beforehand. All of that, of course, would only be possible if Microsoft implemented support for these architectures in their memory handling.
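Roughly what I mean by handing the surface straight to the MFT (again only a sketch, assuming an ID3D11Texture2D source and ignoring the question of whether the encoder MFT will actually accept DXGI buffers):

#include <mfapi.h>
#include <mfidl.h>
#include <d3d11.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Wrap a D3D11 texture in an IMFSample without copying it to system memory first.
HRESULT WrapTextureInSample(ID3D11Texture2D* texture, LONGLONG timestamp100ns,
                            LONGLONG duration100ns, IMFSample** ppSample)
{
    ComPtr<IMFMediaBuffer> buffer;
    // The media buffer only references the DXGI surface; no readback to sysmem here.
    HRESULT hr = MFCreateDXGISurfaceBuffer(__uuidof(ID3D11Texture2D), texture,
                                           0 /*subresource*/, FALSE, &buffer);
    if (FAILED(hr)) return hr;

    ComPtr<IMFSample> sample;
    hr = MFCreateSample(&sample);
    if (FAILED(hr)) return hr;

    hr = sample->AddBuffer(buffer.Get());
    if (FAILED(hr)) return hr;

    sample->SetSampleTime(timestamp100ns);
    sample->SetSampleDuration(duration100ns);

    *ppSample = sample.Detach();
    return S_OK;
}

The sample would then go to the encoder via ProcessInput, provided the MFT is D3D11-aware (MF_SA_D3D11_AWARE) and has been given the device manager through MFT_MESSAGE_SET_D3D_MANAGER; whether the vendor MFTs actually take that path instead of copying internally is exactly my question.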
Phew, walls of text, but I hope that someone has knowledge or maybe just an idea to share. I will investigate this further, as it seems to be the fastest possible way to encode D3D surfaces.
regards
coOkie