preload() and unload() optimization & performance

I’ve been struggling to get the drop-free movie switching that is allegedly possible with proper preloading techniques. Several Facebook group posts yielded more questions than answers so I’m re-posting here as suggested by @raganmd

Essentially I step through a group of moviefilein TOPs, calling the preload() method on a “Deck B” without disrupting playback of “Deck A” … in the wiki it says:

This leaves me wondering something: Is there any need to stagger calling preload such that the next index movie doesn’t get a preload call until movie at current index returns isFullyPreRead == True? I drop frames when the preloads happen on an entire scene of assets from a single script run.
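For what it's worth, here's a minimal sketch of the staggering idea I'm asking about: issue one preload, wait until that player reports fully pre-read, then move on. Since moviefilein TOP calls only exist inside TouchDesigner, this uses a mock player class; in TD the mock's `preload()`/`is_fully_pre_read` would map to the TOP's `preload()` call and `isFullyPreRead` member, and the generator would be stepped once per frame from a callback (all names here are illustrative, not from my project).

```python
# Staggered preloading sketch with mock players standing in for
# moviefilein TOPs. One preload is issued at a time; we only advance to
# the next player once the current one reports it is fully pre-read.

class MockPlayer:
    def __init__(self, name, frames_to_read=2):
        self.name = name
        self.preload_called = False
        self._remaining = frames_to_read  # frames until "fully pre-read"

    def preload(self):
        self.preload_called = True

    @property
    def is_fully_pre_read(self):
        return self.preload_called and self._remaining <= 0

    def tick(self):
        # Simulate background decoding progress, one step per frame.
        if self.preload_called and self._remaining > 0:
            self._remaining -= 1


def staggered_preload(players):
    """Generator: step once per frame; preloads players one at a time."""
    for p in players:
        p.preload()
        while not p.is_fully_pre_read:
            yield p.name          # still waiting on this player
    yield None                    # all players pre-read


deck_b = [MockPlayer(f'movie{i}') for i in range(3)]
stepper = staggered_preload(deck_b)

frames = 0
for state in stepper:
    if state is None:
        break
    for p in deck_b:
        p.tick()                  # one simulated frame passes
    frames += 1

# 3 players x 2 frames each = 6 frames, no frame carried more than one load
print(frames, all(p.is_fully_pre_read for p in deck_b))
```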

I also encountered a VRAM leak: does calling preload() on an already loaded moviefilein TOP unload the previous memory or overwrite it, or is it necessary to call unload() on those TOPs before preload()? Does the cacheMemory=True flag have any special considerations? The only note I saw was that it’s helpful when assets share the same res/aspect (mine do). Can it cause a VRAM leak if I missed some other step? For now cacheMemory=False is my fix, but I’d like to be as efficient as possible with caching.

Along with lots of textport printing, I’ve been staring at the Performance Monitor trying to figure this out: I see lots of “waiting for frame for 0” line items when things slow down. What does that mean? Particularly when the file is a single still-frame PNG? I tried preload(0) to force frame 0 to be ready, and also tried setting alwaysloadinitial = 1, but still sometimes see “waiting” for upwards of 18ms when the images play at the start of a scene… shouldn’t they be ready to play if I called preload()?

I’m attaching a very simplified example of my setup, but it doesn’t experience the same slowdowns as my full project… am I doing it right?

Thanks for any insights,

_Will
preloadSimpleExample_CutMod.zip (1.45 MB)

Hey _Will,

Looking over your example. A few things, not in any particular order:

cacheMemory=True is worth using when new assets are the same resolution as previous assets - swapping one 1920 x 1080 image / video for another. If your dimensions differ from asset to asset, it’s my understanding that this offers no performance benefit. The idea here is that part of the loading call requires the allocation of VRAM; cacheMemory=True allows you to re-use the existing allocation provided that it’s the same size.
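To make that allocation-reuse idea concrete, here's a toy model: an allocator that hands back the existing buffer when the incoming asset has the same dimensions and only allocates fresh when they differ. This is purely an illustration of the concept - TouchDesigner's real allocator is internal, and every name in this sketch is invented.

```python
# Toy model of the cacheMemory idea: reuse an existing allocation when
# the incoming asset matches its dimensions, otherwise allocate fresh.

class VramPool:
    def __init__(self):
        self.allocations = 0   # count of fresh allocations

    def load(self, cached, width, height):
        """Return a buffer for (width, height), reusing `cached` if it fits."""
        if cached is not None and (cached['w'], cached['h']) == (width, height):
            return cached                 # cacheMemory=True path: reuse
        self.allocations += 1             # new VRAM allocation needed
        return {'w': width, 'h': height}


pool = VramPool()
buf = None
# Three same-size loads followed by one differently-sized load:
for w, h in [(1920, 1080), (1920, 1080), (1920, 1080), (1280, 720)]:
    buf = pool.load(buf, w, h)

print(pool.allocations)   # only 2 allocations across 4 loads
```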

preload() doesn’t require an argument, and unless you’re looking to preload at a particular frame you can just call preload().

In your example, if I take out all executes except chopexec_preload, replace cacheMemory=False with True, and just call preload(), I see pretty good results.

It’s worth considering that your set-up, which places textures in the background, circumvents the preload() process: it displays the contents of the moviefilein TOP, which in turn causes it to cook immediately. Really nailing this often means doing a bit of blind work - since looking at files in TOPs means cooking them… which has been frustrating when I’ve chased this puzzle myself.

How big is a typical scene, in terms of assets, in your larger system? My experience with executes is that they aim to fire all calls on the same frame, so if you’re working with a handful of files or fewer you’re in the clear. If, however, you’re working with over a dozen, you’ll need a solution that doesn’t bottleneck. I’ve had good luck using timers for these operations, where the number of cycles matches the number of moviefilein TOPs, and each cycle refreshes the contents of one moviefilein. In some cases it’s enough to set the cycle offset to 1 frame, though I usually end up around 2-5 frames if the images are large.
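The timer approach above can be sketched like this: instead of one execute firing every preload on the same frame, each timer cycle refreshes exactly one player. The callback and player names are illustrative stand-ins - in TD this logic would live in a Timer CHOP's onCycle callback and the body would call `op(player).preload()`.

```python
# Spreading loads across frames: one player refreshed per timer cycle,
# mimicking a Timer CHOP whose cycle count equals the number of
# moviefilein TOPs. Nothing here is real TouchDesigner API.

preload_log = []   # (cycle, player) pairs, to show one load per cycle

def on_cycle(cycle_index, players):
    """Stand-in for a Timer CHOP onCycle callback: preload one player."""
    player = players[cycle_index]
    preload_log.append((cycle_index, player))
    # In TD this line would be: op(player).preload()

players = [f'movie{i}' for i in range(6)]
for cycle in range(len(players)):
    on_cycle(cycle, players)

# Each cycle touched exactly one player, so no single frame bears every load.
print(len(preload_log), preload_log[0], preload_log[-1])
```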

Another viable technique here is to stack your frames into movies where each frame is an individual image, then use the pre-fill option on a cache or texture3D - I’ve had good luck with that as well.

When I’m highly focused on performance and I know the set of assets I’m going to work with then I’ll load them in caches or texture3Ds on start then just retrieve for use. This works well for lots of applications, especially when there’s ample VRAM, and all assets are known, even if the sequences are not.

I think a lot of this is to say that the “right” technique for fast loading and optimization often depends on the use case and several other expected behaviors. For example, we had an installation where we were able to successfully swap decks of 80+ HD HAP vids stutter-free in less than 2 seconds… but the pushback was that it wasn’t instantaneous. Instant and all at once is a tall order once you approach scale - so some of this is also about what you’re looking to do in the end.

As always, drive speed and connection also matter - especially for video.

Sorry, that may add more pieces to your puzzle rather than offer insights. :confused:


Ok so it looks like my basic understanding of cacheMemory is consistent with yours… very weird that it seemed to cause VRAM leaks in my system. (I only suspect that’s the culprit because disabling cacheMemory “fixed” the leak.)

I’m curious whether the memory allocation stays locked to its respective moviefilein TOP: when I call unload(cacheMemory=True) for files that I won’t need for a few minutes in one scene, and then another scene’s players try to preload their files, could that cause a VRAM leak/crash? Does the allocated memory become unusable to everything other than its respective TOP? I want to keep the different types of scenes as their own decks that pre/unload as needed, because they have different grouping/compositing/behavior…

I had only tried that optional preload(0) arg because the Performance Monitor showed “waiting for frame for 0” (not a typo on my part, btw. Not sure what the extra “for” is for)… didn’t help.

I realized that leaving the display flag enabled kinda defeated the purpose. In the actual project I’m pretty sure that incidental force-cook doesn’t happen: all viewers are disabled and the project runs in Perform Mode at launch… I did find it annoying to have to work in a way that avoids cooking, because an Info CHOP forces the movie to cook, so I can’t retrieve the “fully_pre_read” channel to drive other scripts.

The example I shared is just one small module within a scene, and many other asset modules contain movies, not stills. A typical scene fades interactively between 5 background loops in H.264, with a couple of HAPQ-A overlays, a few PNG stills, and several GB of VRAM taken by realtime FX (GPU particles and other 32-bit TOPs). All of that x2, for the 2 screens. I also have some shared assets cached, with only 8GB VRAM total (P4000). So I’m barely squeaking by at our low bar of 24fps. I will try moving the preload() calls into a Timer CHOP’s callback and see if that prevents some of the stalling/hanging.