Video playback using the Wandboard VPU

Postby blackibiza » Fri Jan 17, 2014 8:15 am

Hi

The article does not mention which operating system and ARM ABI were used: was it armel or armhf?
For Linux/Ubuntu on armhf, is the Wandboard team planning to release those drivers to the community? That would be great :D

Regards

Michele
blackibiza
 

Re: Video playback using the Wandboard VPU

Postby Tapani » Fri Jan 17, 2014 9:22 am

It is a Linux (3.0.35) kernel patch.

That means it applies regardless of which userland (armel, armhf, or anything else) you put on top of it. And yes, the patch is released. See the link in the article.
Tapani
Site Admin
 

Re: Video playback using the Wandboard VPU

Postby wolfgar » Fri Jan 17, 2014 1:28 pm

Hi tapani,

Thanks for posting this patch.
Some XBMC users have reported spurious allocation errors that are very likely related to this issue...
I had difficulty reproducing it on my side, so I cannot be 100% sure, but odds are high that you have just solved it.

Best regards
Stephan
wolfgar
 

Re: Video playback using the Wandboard VPU

Postby CruX » Fri Jan 17, 2014 3:28 pm

Hey,

I have just compiled a kernel with this patch and will report back as soon as I get home!
Can't wait!

Cheers,
CruX

Edit: Alright, I applied both b65ef76276775ed1781f7a767610f52d7939cf86 and 7cbd06b2904c1855109084ca6b0c84990bc69233 on top of Stephan's kernel.
720p playback now stutters (right after reboot, playing from SATA), and I also get this on serial:
[  388.822570] __dma_free_remap: trying to free invalid coherent area:   (null)
[  388.829687] Backtrace:
[  388.832260] [<c0046038>] (dump_backtrace+0x0/0x10c) from [<c050deec>] (dump_stack+0x18/0x1c)
[  388.840749]  r6:00000000 r5:00040000 r4:00000000 r3:00000000
[  388.846549] [<c050ded4>] (dump_stack+0x0/0x1c) from [<c004d188>] (dma_free_coherent+0x9c/0x1c8)
[  388.855353] [<c004d0ec>] (dma_free_coherent+0x0/0x1c8) from [<c03c0be0>] (vpu_free_dma_buffer+0xd8/0xe8)
[  388.864879] [<c03c0b08>] (vpu_free_dma_buffer+0x0/0xe8) from [<c03c0e10>] (vpu_release+0x220/0x2b0)
[  388.874204]  r8:d4022ee0 r7:c06f55c4 r6:00000000 r5:c073cfe0 r4:c073efd8
[  388.881002] r3:00000018
[  388.883687] [<c03c0bf0>] (vpu_release+0x0/0x2b0) from [<c0106cfc>] (fput+0x118/0x1e0)
[  388.891531]  r7:d20075e0 r6:00000008 r5:d44604a0 r4:d4718f40
[  388.897417] [<c0106be4>] (fput+0x0/0x1e0) from [<c010308c>] (filp_close+0x78/0x84)
[  388.905014] [<c0103014>] (filp_close+0x0/0x84) from [<c010312c>] (sys_close+0x94/0xcc)
[  388.912948]  r6:d4e83d00 r5:d4e83cc0 r4:00000018 r3:00000000
[  388.918701] [<c0103098>] (sys_close+0x0/0xcc) from [<c0042860>] (ret_fast_syscall+0x0/0x30)
[  388.927068]  r7:00000006 r6:41169dfc r5:41169dec r4:41169f2c


With only the new allocator I still get the same error, but the stuttering (audio drifting out of sync) is better, if not gone entirely (I'm pretty paranoid about out-of-sync audio, so this might as well be placebo at this point :D)
[  135.702604] __dma_free_remap: trying to free invalid coherent area:   (null)
[  135.709885] Backtrace:
[  135.712427] [<c0046038>] (dump_backtrace+0x0/0x10c) from [<c050de4c>] (dump_stack+0x18/0x1c)
[  135.721328]  r6:00000000 r5:00040000 r4:00000000 r3:00000002
[  135.727148] [<c050de34>] (dump_stack+0x0/0x1c) from [<c004d188>] (dma_free_coherent+0x9c/0x1c8)
[  135.735931] [<c004d0ec>] (dma_free_coherent+0x0/0x1c8) from [<c03c0b40>] (vpu_free_dma_buffer+0xd8/0xe8)
[  135.745883] [<c03c0a68>] (vpu_free_dma_buffer+0x0/0xe8) from [<c03c0d70>] (vpu_release+0x220/0x2b0)
[  135.755338]  r8:e9fc7260 r7:c06f55c4 r6:00000000 r5:c073cfe0 r4:c073efd8
[  135.761989] r3:00000018
[  135.764697] [<c03c0b50>] (vpu_release+0x0/0x2b0) from [<c0106c80>] (fput+0x118/0x1e0)
[  135.773271]  r7:e600ac40 r6:00000008 r5:e9cf24a0 r4:e49dc0e0
[  135.779348] [<c0106b68>] (fput+0x0/0x1e0) from [<c0103010>] (filp_close+0x78/0x84)
[  135.786942] [<c0102f98>] (filp_close+0x0/0x84) from [<c01030b0>] (sys_close+0x94/0xcc)
[  135.795029]  r6:e9fd45e0 r5:e9fd45a0 r4:00000021 r3:00000000
[  135.800769] [<c010301c>] (sys_close+0x0/0xcc) from [<c0042860>] (ret_fast_syscall+0x0/0x30)
[  135.809299]  r7:00000006 r6:41170dfc r5:41170dec r4:41170f2c


After some (short) testing, the VPU allocation error seems to be fixed, though!
Edit:
Longer testing revealed that the error has improved but is not fixed: after the system has been running for a long time, starting VPU playback still results in a kernel panic. (I didn't have serial active at that point :/)
Last edited by CruX on Mon Jan 20, 2014 9:21 am, edited 1 time in total.
CruX
 

Re: Video playback using the Wandboard VPU

Postby sentientnz » Sat Jan 18, 2014 5:59 pm

Isn't this the reason CMA was implemented in the kernel: to ensure that large blocks of contiguous memory would be available, and then not allocate small blocks from them until necessary?
sentientnz
 

Re: Video playback using the Wandboard VPU

Postby nicolauz » Tue May 06, 2014 8:49 am

Hi,

I'm using the same patch on a custom board.
It seems to fix the DMA "Physical memory..." error!

Thanks for that!

I also see the same "__dma_free_remap: trying to free invalid coherent area" errors!

Now there seems to be a similar/new memory leak/problem ... now with 'Normal' memory.
After a lot of play cycles (~4000) I get an 'out of memory' error and the kernel kills processes (normally the video player first).

As the kernel errors somewhat point to a memory problem ... I'd like to fix that before investing time somewhere else.

Does anyone have a clue why/how these errors are produced and what they might imply?




Regarding OpenGL playback:
QtMultimedia does something very similar to glimagesink (but within its own OpenGL context):
it binds the VPU memory (coming out of GStreamer) to a texture rather than copying the data.

See my short video about it: https://www.youtube.com/watch?v=pmxsWGhrrBQ
.. it's running on the sabreauto, but I had that running on the Wandboard Solo as well (although 1080p + shader effects won't perform on the small GPU)
nicolauz
 

Re: Video playback using the Wandboard VPU

Postby nicolauz » Wed May 07, 2014 9:18 am

Hi,

I tested this further with a plain C/GLib/GStreamer program:
main.c
(3.33 KiB)


It basically builds a GStreamer pipeline and main loop, plays a very short (~1 s) .mp4 file, then destroys everything, and repeats.

To log the memory status, it calls:
foo.sh
(173 bytes)
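
The contents of foo.sh are not shown in the thread. As a rough illustration of this kind of logging, the sketch below reads the MemFree value from /proc/meminfo in C. The function names meminfo_value_kb and read_memfree_kb are hypothetical, and MemFree is a system-wide figure rather than the per-zone "Normal" number discussed here, so what exactly gets logged is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): return the value of `key`
 * (e.g. "MemFree") in kB from text in /proc/meminfo format,
 * or -1 if the key is not found or has no number. */
long meminfo_value_kb(const char *text, const char *key)
{
    size_t keylen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* A matching line looks like "MemFree:   102400 kB". */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            long value;

            if (sscanf(line + keylen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;             /* step past the newline */
    }
    return -1;
}

/* Read /proc/meminfo (Linux only) and return MemFree in kB, or -1. */
long read_memfree_kb(void)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_value_kb(buf, "MemFree");
}
```

Calling read_memfree_kb() once per play cycle and printing the result next to the cycle counter would give a log comparable to the one described.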


The log is far too large to upload (and .xz attachments aren't allowed).

There is no leak of DMA memory anymore with the kernel change! Thanks again.

But it slowly leaks "Normal" memory:
- at start, 100 MB is free
- after around 30,000 play cycles, less than 2 MB is free
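
As a back-of-the-envelope check of the figures above (100 MB free at start, under 2 MB after about 30,000 cycles), the average loss works out to at least roughly 3.3 KiB per play cycle. This assumes the leak is linear and that 1 MB is taken as 1024 KiB. The helper name leak_per_cycle_kib is hypothetical.

```c
/* Hypothetical helper: average memory lost per play cycle, in KiB,
 * given free memory in MiB before and after `cycles` play cycles.
 * Assumes the loss is spread evenly over all cycles. */
double leak_per_cycle_kib(double free_start_mib, double free_end_mib,
                          long cycles)
{
    return (free_start_mib - free_end_mib) * 1024.0 / (double)cycles;
}
```

A leak of a few KiB per cycle is small enough to come from a single object that is not released on each pipeline teardown, which fits the uncertainty about whether main.c or GStreamer is responsible.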


I'm not 100% sure whether the memory leak is caused by my small, simple main.c or by GStreamer.
I think main.c cleans up everything properly ... but my GLib experience is very limited :)
nicolauz
 

Re: Video playback using the Wandboard VPU

Postby Tapani » Fri May 09, 2014 9:43 am

Try this patch:
iMX6 VPU: Do not allow null pointers to be inserted to free memory pool

diff --git a/drivers/mxc/vpu/mxc_vpu.c b/drivers/mxc/vpu/mxc_vpu.c
index e478272..2369989 100644
--- a/drivers/mxc/vpu/mxc_vpu.c
+++ b/drivers/mxc/vpu/mxc_vpu.c
@@ -123,13 +123,13 @@ static unsigned int pc_before_suspend;
 #define DMA_MEM_MAX_CHUNKS 8
 /* Pool graqnularity: must be power of 2, 128k or 256k are recommended */
 #define DMA_MEM_CHUNKSIZE (1 << 18)
-static u32 vpu_dma_mem_free_list[DMA_MEM_MAX_CHUNKSIZES][DMA_MEM_MAX_CHUNKS] = {{ 0 }};
-static u32 vpu_dma_mem_phys_free_list[DMA_MEM_MAX_CHUNKSIZES][DMA_MEM_MAX_CHUNKS] = {{ 0 }};
-static u32 vpu_dma_mem_nof_free[DMA_MEM_MAX_CHUNKSIZES] = { 0 };
-static u32 vpu_dma_mem_total_allocs = 0;
-static u32 vpu_dma_mem_in_use = 0;
-static u32 vpu_dma_mem_pooled = 0;
-static u32 vpu_dma_mem_peak_usage = 0;
+static volatile u32 vpu_dma_mem_free_list[DMA_MEM_MAX_CHUNKSIZES][DMA_MEM_MAX_CHUNKS] = {{ 0 }};
+static volatile u32 vpu_dma_mem_phys_free_list[DMA_MEM_MAX_CHUNKSIZES][DMA_MEM_MAX_CHUNKS] = {{ 0 }};
+static volatile u32 vpu_dma_mem_nof_free[DMA_MEM_MAX_CHUNKSIZES] = { 0 };
+static volatile u32 vpu_dma_mem_total_allocs = 0;
+static volatile u32 vpu_dma_mem_in_use = 0;
+static volatile u32 vpu_dma_mem_pooled = 0;
+static volatile u32 vpu_dma_mem_peak_usage = 0;
 static spinlock_t vpu_dma_mem_lock;
 
 /* Helper function for the pooled dma allocator */
@@ -153,23 +153,39 @@ static int vpu_alloc_dma_buffer(struct vpu_mem_desc *mem)
                vpu_dma_mem_nof_free[size_in_chunks]--;
                mem->cpu_addr = vpu_dma_mem_free_list[size_in_chunks][vpu_dma_mem_nof_free[size_in_chunks]];
                mem->phy_addr = vpu_dma_mem_phys_free_list[size_in_chunks][vpu_dma_mem_nof_free[size_in_chunks]];
+                if ((void *)(mem->cpu_addr) == NULL) printk(KERN_ERR "VPU : NULL pointer in free pool!?\n");
+//                else printk(KERN_ERR "VPU : got ptr %p from pool ix %i\n", mem->cpu_addr, size_in_chunks);
                vpu_dma_mem_pooled -= size_in_chunks * DMA_MEM_CHUNKSIZE;
                vpu_dma_mem_in_use += size_in_chunks * DMA_MEM_CHUNKSIZE;
-               spin_unlock(&vpu_dma_mem_lock);
        } else {
-               spin_unlock(&vpu_dma_mem_lock);
                mem->cpu_addr = (unsigned long)
                        dma_alloc_coherent(NULL, PAGE_ALIGN(size_in_chunks * DMA_MEM_CHUNKSIZE),
                                (dma_addr_t *) (&mem->phy_addr),
                                GFP_DMA | GFP_KERNEL);
+//                printk(KERN_ERR " new alloc %i kB\n", PAGE_ALIGN(size_in_chunks * DMA_MEM_CHUNKSIZE));
                if ((void *)mem->cpu_addr != NULL)
                        vpu_dma_mem_in_use += size_in_chunks * DMA_MEM_CHUNKSIZE;
        }
         
        if (vpu_dma_mem_in_use > vpu_dma_mem_peak_usage) vpu_dma_mem_peak_usage = vpu_dma_mem_in_use;
 
+        spin_unlock(&vpu_dma_mem_lock);
+
        if ((void *)(mem->cpu_addr) == NULL) {
-               printk(KERN_ERR "Physical memory allocation error!\n");
+               int i;
+
+                printk(KERN_ERR "VPU : Physical memory allocation error!\n");
+
+                for (i=0; i<DMA_MEM_MAX_CHUNKSIZES; i++) {
+                        if (vpu_dma_mem_nof_free[i] > 0) {
+                                printk(KERN_ERR " %2i : %4ikB : %i\n", i, DMA_MEM_CHUNKSIZE*i >> 10, vpu_dma_mem_nof_free[i]);
+                        }
+                }
+                printk(KERN_ERR "Total pooled : %4ikB\n", vpu_dma_mem_pooled >> 10);
+                printk(KERN_ERR "Current usage : %4ikB\n", vpu_dma_mem_in_use >> 10);
+                printk(KERN_ERR "Max usage : %4ikB\n", vpu_dma_mem_peak_usage >> 10);
+                printk(KERN_ERR "Last request : %4ikB [ chunk = %4ikB, pool ix = %2i ]\n", mem->size >> 10, chunked_size >> 10, size_in_chunks);
+
                return -1;
        }
        return 0;
@@ -185,19 +201,20 @@ static void vpu_free_dma_buffer(struct vpu_mem_desc *mem)
 
        spin_lock(&vpu_dma_mem_lock);
 
-       if (size_in_chunks < DMA_MEM_MAX_CHUNKSIZES && vpu_dma_mem_nof_free[size_in_chunks] < DMA_MEM_MAX_CHUNKS) {
-               vpu_dma_mem_in_use -= PAGE_ALIGN(chunked_size);
-               vpu_dma_mem_free_list[size_in_chunks][vpu_dma_mem_nof_free[size_in_chunks]] = mem->cpu_addr;
-               vpu_dma_mem_phys_free_list[size_in_chunks][vpu_dma_mem_nof_free[size_in_chunks]] = mem->phy_addr;
-               vpu_dma_mem_nof_free[size_in_chunks]++;
-               vpu_dma_mem_pooled += size_in_chunks * DMA_MEM_CHUNKSIZE;
-               spin_unlock(&vpu_dma_mem_lock);
-       } else {               
-               spin_unlock(&vpu_dma_mem_lock);
-               vpu_dma_mem_in_use -= PAGE_ALIGN(chunked_size);
-               dma_free_coherent(0, PAGE_ALIGN(chunked_size),
+       if ((void *)mem->cpu_addr != NULL) {
+               if (size_in_chunks < DMA_MEM_MAX_CHUNKSIZES && vpu_dma_mem_nof_free[size_in_chunks] < DMA_MEM_MAX_CHUNKS) {
+                       vpu_dma_mem_in_use -= PAGE_ALIGN(chunked_size);
+                       vpu_dma_mem_free_list[size_in_chunks][vpu_dma_mem_nof_free[size_in_chunks]] = mem->cpu_addr;
+                       vpu_dma_mem_phys_free_list[size_in_chunks][vpu_dma_mem_nof_free[size_in_chunks]] = mem->phy_addr;
+                       vpu_dma_mem_nof_free[size_in_chunks]++;
+                       vpu_dma_mem_pooled += size_in_chunks * DMA_MEM_CHUNKSIZE;
+               } else {
+                       vpu_dma_mem_in_use -= PAGE_ALIGN(chunked_size);
+                       dma_free_coherent(0, PAGE_ALIGN(chunked_size),
                                  (void *)mem->cpu_addr, mem->phy_addr);
+               }
        }
+       spin_unlock(&vpu_dma_mem_lock);
 }
 
 /*!
Tapani
Site Admin
 

Re: Video playback using the Wandboard VPU

Postby nicolauz » Wed May 21, 2014 9:54 am

Sorry for my late reply.
Thanks for the patch.
I wasn't able to copy & paste it, so I copied it line by line ... hope I haven't broken anything ;)
.. attached as file.

With that patch I get a kernel panic when playback starts:

root@amherst:~# BUG: scheduling while atomic: aiurdemux0:sink/266/0x00000000
Modules linked in:
[<80048304>] (unwind_backtrace+0x0/0xf4) from [<8054768c>] (__schedule+0x6a0/0x77c)
[<8054768c>] (__schedule+0x6a0/0x77c) from [<8054962c>] (__down_write_nested+0xac/0xe4)
[<8054962c>] (__down_write_nested+0xac/0xe4) from [<80106324>] (sys_mmap_pgoff+0x58/0xc8)
[<80106324>] (sys_mmap_pgoff+0x58/0xc8) from [<80041840>] (ret_fast_syscall+0x0/0x30)
BUG: scheduling while atomic: aiurdemux0:sink/266/0x00000000
Modules linked in:
[<80048304>] (unwind_backtrace+0x0/0xf4) from [<8054768c>] (__schedule+0x6a0/0x77c)
[<8054768c>] (__schedule+0x6a0/0x77c) from [<800418a0>] (ret_slow_syscall+0x0/0x4)
BUG: scheduling while atomic: aiurdemux0:sink/266/0x00000000
Modules linked in:
[<80048304>] (unwind_backtrace+0x0/0xf4) from [<8054768c>] (__schedule+0x6a0/0x77c)
[<8054768c>] (__schedule+0x6a0/0x77c) from [<800418a0>] (ret_slow_syscall+0x0/0x4)
 clk_disable cannot be called in an interrupt context
[<80048304>] (unwind_backtrace+0x0/0xf4) from [<8006375c>] (clk_disable+0x80/0x90)
[<8006375c>] (clk_disable+0x80/0x90) from [<803ec6f4>] (vpu_ioctl+0x730/0x8b8)
[<803ec6f4>] (vpu_ioctl+0x730/0x8b8) from [<8012e3d8>] (do_vfs_ioctl+0x3b4/0x530)
[<8012e3d8>] (do_vfs_ioctl+0x3b4/0x530) from [<8012e588>] (sys_ioctl+0x34/0x60)
[<8012e588>] (sys_ioctl+0x34/0x60) from [<80041840>] (ret_fast_syscall+0x0/0x30)
kernel BUG at arch/arm/plat-mxc/clock.c:132!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 90684000
[00000000] *pgd=2066d831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT SMP
Modules linked in:
CPU: 0    Not tainted  (3.0.35+yocto+g396eec4 #1)
PC is at __bug+0x1c/0x24
LR is at __bug+0x18/0x24
pc : [<80044ca4>]    lr : [<80044ca0>]    psr: 600f0013
sp : 90b85ed0  ip : 802d1dd4  fp : 371cf86c
r10: 903e5430  r9 : 90b84000  r8 : 800419c4
r7 : 90588460  r6 : 371cf86c  r5 : 07ffff00  r4 : 80b0cdb4
r3 : 00000000  r2 : ffffffff  r1 : 00000001  r0 : 00000033
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c53c7d  Table: 2068404a  DAC: 00000015
Process aiurdemux0:sink (pid: 266, stack limit = 0x90b842f0)
Stack: (0x90b85ed0 to 0x90b86000)
5ec0:                                     00000000 8006376c 80ba8054 00000000
5ee0: 00000000 803ec6f4 90666340 80b03f30 90b85fac 80547108 90b85f1c 800728d4
5f00: 90588460 0000000f 371cf86c 90588460 800419c4 8012e3d8 90b85f4c 80074958
5f20: 00000000 200f0093 90588460 8003b0a0 8003b0a0 8003b0a0 8003a24c 8003b0a0
5f40: 90666388 9066637c 9059da40 80276f6c 90666380 800f0013 00000000 38c55000
5f60: 38c55000 00005607 0000000f 371cf86c 90588460 800419c4 90b84000 00000000
5f80: 00000000 8012e588 371cf858 00000001 00016900 368f07c4 00000000 368f0b14
5fa0: 00000036 80041840 368f07c4 00000000 0000000f 00005607 371cf86c 368f09e8
5fc0: 368f07c4 00000000 368f0b14 00000036 368f0a08 4c8fb7e4 351101f0 00000000
5fe0: 368f088c 371cf864 368d72c8 4c101a7c 600f0010 0000000f fcf3df4a c2bf55ec
[<80044ca4>] (__bug+0x1c/0x24) from [<8006376c>] (clk_debug_register+0x0/0x20c)
[<8006376c>] (clk_debug_register+0x0/0x20c) from [<80b03f30>] (0x80b03f30)
Code: e3040828 e34800a2 eb13f754 e3a03000 (e5833000)
---[ end trace 97bcb751d6480e64 ]---
Kernel panic - not syncing: Fatal exception in interrupt
[<80048304>] (unwind_backtrace+0x0/0xf4) from [<805428e0>] (panic+0x84/0x198)
[<805428e0>] (panic+0x84/0x198) from [<8004516c>] (die+0x27c/0x2d0)
[<8004516c>] (die+0x27c/0x2d0) from [<805426cc>] (__do_kernel_fault.part.3+0x64/0x74)
[<805426cc>] (__do_kernel_fault.part.3+0x64/0x74) from [<8054c568>] (do_page_fault+0x1d4/0x3a4)
[<8054c568>] (do_page_fault+0x1d4/0x3a4) from [<8003c380>] (do_DataAbort+0x38/0x98)
[<8003c380>] (do_DataAbort+0x38/0x98) from [<8054a250>] (__dabt_svc+0x70/0xa0)
Exception stack(0x90b85e88 to 0x90b85ed0)
5e80:                   00000033 00000001 ffffffff 00000000 80b0cdb4 07ffff00
5ea0: 371cf86c 90588460 800419c4 90b84000 903e5430 371cf86c 802d1dd4 90b85ed0
5ec0: 80044ca0 80044ca4 600f0013 ffffffff
[<8054a250>] (__dabt_svc+0x70/0xa0) from [<80044ca4>] (__bug+0x1c/0x24)
[<80044ca4>] (__bug+0x1c/0x24) from [<8006376c>] (clk_debug_register+0x0/0x20c)
[<8006376c>] (clk_debug_register+0x0/0x20c) from [<80b03f30>] (0x80b03f30)
Attachments
do_not_allow_null_pointers.patch
(5.02 KiB)
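
The thread does not establish what triggers the "scheduling while atomic" messages. One thing worth noting is that the second patch moves spin_unlock() to after dma_alloc_coherent(..., GFP_DMA | GFP_KERNEL), and a GFP_KERNEL allocation may sleep, which is not allowed while a spinlock is held. The userspace sketch below only illustrates the general pattern of a pooled allocator that refuses NULL pointers while keeping the possibly sleeping calls outside the lock, as the pre-patch code did. All names (pool_alloc, pool_free, sleeping_alloc and so on) are hypothetical, the lock is simulated with a flag, and the value of MAX_CHUNKSIZES is an assumption because the patch does not show it.

```c
#include <stdlib.h>

#define MAX_CHUNKSIZES 8   /* assumption: value not shown in the patch */
#define MAX_CHUNKS     8   /* mirrors DMA_MEM_MAX_CHUNKS in the patch */

static void *free_list[MAX_CHUNKSIZES][MAX_CHUNKS];
static unsigned nof_free[MAX_CHUNKSIZES];

static int lock_held;          /* simulated spinlock state */
static int sleeps_under_lock;  /* counts sleeping calls made with the lock held */

static void pool_lock(void)   { lock_held = 1; }
static void pool_unlock(void) { lock_held = 0; }

/* Stand-ins for calls that may sleep (here simply malloc/free). */
static void *sleeping_alloc(size_t size)
{
    if (lock_held)
        sleeps_under_lock++;
    return malloc(size);
}

static void sleeping_free(void *p)
{
    if (lock_held)
        sleeps_under_lock++;
    free(p);
}

/* Take a buffer from the pool if one is available, else allocate one. */
void *pool_alloc(unsigned chunks, size_t chunk_bytes)
{
    void *p = NULL;

    pool_lock();
    if (chunks < MAX_CHUNKSIZES && nof_free[chunks] > 0)
        p = free_list[chunks][--nof_free[chunks]];
    pool_unlock();              /* drop the lock before anything that may sleep */

    if (p == NULL)
        p = sleeping_alloc(chunks * chunk_bytes);
    return p;
}

/* Return a buffer to the pool, or free it if the pool slot is full. */
void pool_free(void *p, unsigned chunks)
{
    int pooled = 0;

    if (p == NULL)              /* never insert a NULL pointer into the pool */
        return;

    pool_lock();
    if (chunks < MAX_CHUNKSIZES && nof_free[chunks] < MAX_CHUNKS) {
        free_list[chunks][nof_free[chunks]++] = p;
        pooled = 1;
    }
    pool_unlock();

    if (!pooled)
        sleeping_free(p);       /* done outside the lock */
}

/* Number of times a sleeping call was made while the lock was held. */
int pool_sleeps_under_lock(void)
{
    return sleeps_under_lock;
}
```

In this arrangement only the free-list bookkeeping happens under the lock, so pool_sleeps_under_lock() stays at zero. Whether the same restructuring removes the panic on the real driver would need testing on the board.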
nicolauz
 

Re: Video playback using the Wandboard VPU

Postby nicolauz » Tue May 27, 2014 2:09 pm

Has anyone else tested the second patch?
Has anyone been able to get rid of the kernel messages some other way?
nicolauz
 
