How to use your RAM with the GPU (nope, just a BAD bug in the log please DAZ fix it)
Padone
This came out while I was trying to help Hesperasmith to solve his issues.
One very interesting thing about disabling texture compression and OptiX is that your card can go beyond its dedicated memory and still render on the GPU (no falling back to CPU).
It seems that this way the card can use the shared system memory that is available to the GPU (VRAM + RAM), whereas with texture compression and/or OptiX enabled it can only use the dedicated memory (VRAM). So, if you resize textures yourself and disable compression, you have much more memory to play with in your scenes.
Comments
this sounds like Octane's out of core feature for textures and is useful information indeed!
Yes, be careful though, because disabling texture compression with the average Daz content leaves you with huge texture memory requirements if you don't resize the textures down yourself. For example, the G3F alone goes from 350 MB with texture compression to about 2.5 GB without. So maybe this is more academic than useful, but it's cool to know anyway.
How does one disable texture compression?
Just go to the iray panel and set very high thresholds, so compression is never triggered. For example, if you know the max texture size in your scene is 4K, set 8K for mid and 16K for high. Don't forget to disable OptiX too.
render settings > advanced > compression
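To picture what the two thresholds do, here is a minimal Python sketch. It assumes a texture gets compressed when its largest side in pixels is above a threshold; that rule, the function name `compression_level` and the strict "greater than" comparison are assumptions for illustration, not confirmed Iray behaviour. The defaults of 512 and 1024 are the default values quoted later in this thread.

```python
def compression_level(max_dimension, mid_threshold=512, high_threshold=1024):
    """Return the compression tier a texture would fall into.

    Assumption: a texture is compressed when its largest dimension in
    pixels exceeds a threshold. The exact comparison Iray uses is not
    confirmed here.
    """
    if max_dimension > high_threshold:
        return "high"
    if max_dimension > mid_threshold:
        return "medium"
    return "none"


# Default thresholds: a 4K map gets high compression.
print(compression_level(4096))               # high
# Thresholds raised above the largest map: nothing is compressed.
print(compression_level(4096, 8192, 16384))  # none
```

So with 4K maps, thresholds of 8K and 16K leave every texture in the "none" tier, which is the effect described above.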
Ah, no wonder, I was looking for some hidden on off button I couldn't find. Will give it a go when I got some time to mess around.
I did some more tests with compression turned off and I found some interesting results.
1) First, iray doesn't care about truecolor or grayscale images. It always converts diffuse maps to 32 bits per pixel and opacity maps to 8 bits per pixel (the 1M figure below is one byte per pixel; 16 bits would take 2M). So a 1024x1024 texture used as diffuse map always takes 4M and the same texture used as opacity map takes 1M.
4M 1024x1024 24 bit truecolor used as diffuse map
1M 1024x1024 24 bit truecolor used as opacity map
4M 1024x1024 16 bit greyscale used as diffuse map
1M 1024x1024 16 bit greyscale used as opacity map
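These figures can be checked with simple arithmetic. The sketch below assumes "M" means MiB and that a texture costs width x height x bytes per pixel, with no mipmaps or padding; the helper name `texture_bytes` is made up for the example. Under those assumptions 4M works out to 4 bytes per pixel and 1M to 1 byte per pixel.

```python
MIB = 1024 ** 2


def texture_bytes(width, height, bits_per_pixel):
    """Uncompressed size of a texture, ignoring mipmaps and padding."""
    return width * height * bits_per_pixel // 8


for bits in (32, 16, 8):
    size = texture_bytes(1024, 1024, bits)
    print(f"{bits:2d} bits per pixel: {size / MIB:.0f} MiB")
# 32 bits per pixel: 4 MiB
# 16 bits per pixel: 2 MiB
#  8 bits per pixel: 1 MiB
```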
2) Second, iray can't understand when the same texture is used multiple times. So for example in the G3F you have the very same RyJeane_eyes01_1007 texture used for the Sclera, Irises and Pupils surfaces. But it is accounted as 3 separate textures so they take 3x memory.
I don't know if it is possible to change the surfaces directly inside DS so as to have one surface for the whole eye geometry. Maybe this is possible by editing the model in an external app.
Are you freakin kidding me??? THAT'S SO STUPID.
I had been operating under the assumption that the same file used multiple times didn't use up more space.
ugh
I think the memory use depends on the instancing optimization setting, and it's why they have it. It's normally set to Speed, at the expense of taking up more memory. Try the tests again with a Memory setting, and see what that does. Please post the results if you have the time. Theoretically, this should cause Iray to flatten all the geometry into a single block, which in turn should (might?, sorta?, kinda?) cause a change to how the textures are applied.
Tobor, since I have a very low-end pc I always work with instancing optimization on memory. So unfortunately this is not the issue. Anyway to test it is very easy and quite fast too.
- load the G3F and reset all materials to daz default without textures
- apply just the iris texture and render
- apply the same texture to sclera and render
Now if you check the log you will see that texture memory is doubled in the second render. With or without compression it doesn't matter. That's because the texture is counted twice for two materials, even if it is the very same texture.
I agree with William that it's not the smartest thing a rendering engine can do ..
I ran a simple test, I noticed the GPU doesn't go back to baseline between renders. That is a problem in itself, it shouldn't be holding vram like that. I took two cubes, added iray base surface, then plugged a simple 2048 gradient into the base color. Rendered, then closed studio to flush the vram. Opened the same scene and gave one of the cubes a different 2048 gradient in the base color and rendered.
That doesn't make sense; I can have a number of figures with the same textures in a scene, yet if I'm using different textures, then 4 or 5 is about the max.
It is a while since I noticed this; I wonder if something has changed?
@TheKd Yes I have the same behaviour even working with instancing on memory. It seems once iray takes your vram it doesn't give it back till you kill it LOL .. Anyway this issue doesn't bother me much since I only use one program at a time. That is, when I render with DS I don't render with Blender. Also I have a separate card for the viewport so the Geforce is only for rendering. So iray can hold all the vram it wants I don't care too much ..
@nicstt That is not what I get here. If I render a scene with two characters the texture memory is doubled. Even if the characters are the same.
- load G3F and render
- load another G3F and render
Now if you check the log you will see that in the second render the texture memory is doubled. And of course they use the same texture set.
For the "After Render" screenshot, is this with all render windows closed? We know that, by design, Iray keeps the scene in memory if a render window is still open. By keeping the window open, you can start a new render in much less time, as the scene is already there.
For a more complicated test, I loaded two genesis 3 default females and rendered, saved and closed. Opened, changed them to the same skin set from another character, rendered, saved and closed. Opened again and changed them to two different skin sets from the same vendor, rendered saved and closed. Then I took each saved test, and added NGS2 shaders and repeated the tests again.
For two default genesis 3 females, log reports this:
Geometry memory consumption: 27.4106 MiB (device 0), 0 B (host)
Texture memory consumption: 1.04688 GiB (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 122.996 KiB (GPU)
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.077s)
Two Genesis 3 females with the same non-default skin set; the log shows this:
Geometry memory consumption: 29.5474 MiB (device 0), 0 B (host)
Texture memory consumption: 2.32813 GiB (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 123.996 KiB (GPU)
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.077s)
Two Genesis 3 with different skin set:
Geometry memory consumption: 27.4106 MiB (device 0), 0 B (host)
Texture memory consumption: 1.38281 GiB (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 122.996 KiB (GPU)
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.078s)
Both Default Skin NGS2:
Geometry memory consumption: 27.4106 MiB (device 0), 0 B (host)
Texture memory consumption: 3.36719 GiB (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 125.996 KiB (GPU)
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.077s)
Two of the same other skin NGS2:
Geometry memory consumption: 29.5474 MiB (device 0), 0 B (host)
Texture memory consumption: 4.36719 GiB (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 125.996 KiB (GPU)
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.077s)
Ran all the same tests again, once with OptiX on and texture compression at normal settings, and once with OptiX off and the texture compression thresholds set to 9999. I also added a view showing RAM and CPU use to the screenshots.
[Screenshots: Tests 1, 2 and 3, each with OptiX/compression on and off]
Not really seeing any real consistency in my tests lol. Especially odd is the largest one, NGS2 with the same non-default skin. The log says texture alone is 4.36719 GiB, but GPU-Z says only 2979 MB total is being used. It's really weird.
Sorry, yes, that is with render finished, and the popup closed. It seems to always be keeping around 2GB in VRAM according to GPU-Z after all those test renders until studio is closed, then it goes back down again.
Just a note, that the values in the DS log file are PRE-Iray processing (i.e., those are the raw sizes as DS sees them.) This was established in another thread somewhere. So even though DS's log file may say it's 1.2GB of Textures, that is BEFORE Iray compresses/etc. them. They may end up being considerably less. Trust the values reported by GPU-Z and such more.
@hphoenix Sorry, I believe you may be wrong. The log reports the values INCLUDING iray processing. You can see a big difference in the reported values with compression on and off. That can only happen if the log takes compression into account.
Iray VERBOSE - CUDA device 0 Processing scene ** this is where the iray scene processing starts, including texture compression
Iray VERBOSE - Geometry memory consumption: 13.7058 MiB
Iray VERBOSE - Texture memory consumption: 2.49219 GiB ** this is texture memory after compression, if it's not disabled
Iray INFO - Scene processed in 16.833s ** this is where the iray scene processing is complete
Just to check it yourself you can do this very simple test
1) load G3F and render with default compression values mid 512 and high 1024
in the log you can see a texture memory of about 350 MB
2) now set mid 8192 and high 16384 and render again
this time in the log you will get about 2.5 GB of texture memory
that's with compression disabled; it's a big difference, not just peanuts
@TheKD I don't get what you are measuring with your tests. You can't measure texture memory with GPU-Z, since it gives you the overall total: geometry + texture + iray workspace + framebuffer. If you want to check texture memory you have to use the log.
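If you want to pull the texture figures out of a log without reading it by eye, something like the following Python sketch could do it. It only assumes the line format shown in the excerpts in this thread ("Texture memory consumption: <number> <unit>"); the function name `texture_bytes_from_log` is hypothetical and nothing here is part of DS or Iray.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}
PATTERN = re.compile(
    r"Texture memory consumption:\s*([\d.]+)\s*(B|KiB|MiB|GiB)"
)


def texture_bytes_from_log(text):
    """Return every texture memory figure found in the log text, in bytes."""
    return [float(value) * UNITS[unit] for value, unit in PATTERN.findall(text)]


sample = """\
Iray VERBOSE - Geometry memory consumption: 13.7058 MiB
Iray VERBOSE - Texture memory consumption: 2.49219 GiB
"""
for size in texture_bytes_from_log(sample):
    print(f"{size / 1024 ** 2:.1f} MiB")  # 2552.0 MiB
```

Running it over the logs of two renders makes it easy to compare the texture figure before and after a change such as toggling compression.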
The log doesn't appear to be accurate though.
Test with two different NGS2 skins illustrates that the clearest. According to the log:
Geometry memory consumption: 29.5474 MiB (device 0), 0 B (host)
Texture memory consumption: 4.36719 GiB (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 125.996 KiB (GPU)
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.077s)
That's about 6.2 GB of VRAM.
GPU-Z says the total VRAM loaded is only 2.979 GB, less than the texture load alone reported in the log.
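The sum can be reproduced from the log lines quoted above. This is only a sketch of the arithmetic: it takes the log units at face value and assumes the GPU-Z reading of 2979 is in MiB, which is a guess.

```python
KIB, MIB, GIB = 1024, 1024 ** 2, 1024 ** 3

# Values copied from the log excerpt above.
log_entries = {
    "geometry": 29.5474 * MIB,
    "texture": 4.36719 * GIB,
    "materials": 125.996 * KIB,
    "frame buffer": 114.441 * MIB,
    "work space": 1.65625 * GIB,
}

log_total = sum(log_entries.values())
gpu_z_total = 2979 * MIB  # GPU-Z reading; the unit is assumed to be MiB

print(f"log total: {log_total / GIB:.2f} GiB")   # log total: 6.16 GiB
print(f"GPU-Z:     {gpu_z_total / GIB:.2f} GiB")  # GPU-Z:     2.91 GiB
```

So the log adds up to more than twice what GPU-Z shows for the whole card.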
Ran the same test again, 2 g3f with different skins, both with NGS2. This time I set compression to 9999 on both to take compression out of the equation. Log says:
Geometry memory consumption: 27.4106 MiB (device 0), 0 B (host)
Texture memory consumption: 8.50781 GiB (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 125.996 KiB (GPU)
Scene processed in 6.930s
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
I used both Afterburner and GPU-Z this time, to make sure GPU-Z wasn't just plain wrong; they are within 1 MB of each other in their readings. So the log is really no good to go by, it seems. The log says over 8 GB on textures alone, yet actual usage is way less. I don't really get it.
Some more tests, going to push it on these, optix and compression off for all. I am probably gonna crash the computer haha. Well, so far up to 7 genesis 3 females, with different skins, I think I just proved OP is right! I can't believe it even rendered, I thought I was gonna blue screen for sure the way it locked up while processing the scene. I mean they are naked, and no hair, but still, that is a lot of figures.
4 genesis 3 females
Geometry memory consumption: 54.8208 MiB (device 0), 0 B (host)
Texture memory consumption: 6.90625 GiB (device 0)
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 228.582 KiB (GPU)
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
5 genesis 3 females
Geometry memory consumption: 68.526 MiB (device 0), 0 B (host)
Texture memory consumption: 8.02344 GiB (device 0)
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 281.375 KiB (GPU)
Scene processed in 64.391s
Allocated 114.441 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
6 genesis 3 females
Geometry memory consumption: 82.2311 MiB (device 0), 0 B (host)
Texture memory consumption: 9.70313 GiB (device 0)
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 334.168 KiB (GPU)
Scene processed in 39.698s
Allocated 114.441 MiB for frame buffer
Allocated 1.01342 GiB of work space (1253k active samples in 0.000s)
7 genesis 3 females
Geometry memory consumption: 95.9362 MiB (device 0), 0 B (host)
Texture memory consumption: 11.4453 GiB (device 0)
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 386.961 KiB (GPU)
Scene processed in 49.593s
Allocated 114.441 MiB for frame buffer
Allocated 679.063 MiB of work space (820k active samples in 0.000s)
Ah man, just noticed I forgot to offset the 7th one for the render, gonna have to do it one more time. It worked again, proper head count can be made in this one :D Original render size for all tests is 2000 x 2500 in case anyone is wondering, I had to crop out the bottom for nudity reasons.
One more thing, I don't have some kind of supercomputer here either, here is my rig. It's not horrible, but it's not really anything special either.
@TheKD When making tests it would be better to use the default content so everyone can redo them. I don't have NGS2 so I can't verify your results. Can you please verify those with the standard G3F and let us know? Also, I don't know if using the 9999 value could somehow "confuse" DS. I would stay with powers of two: since these are technical parameters passed to the compressor, I feel it's better to avoid translations.
1) default G3F with compression 512-1024 should have 350 MB in the log
2) default G3F with compression 8192-16384 should have 2.5 GB in the log
3) multiple instances of G3F should multiply the texture memory so 3x G3F = 3x memory
As for gpu-z not reporting the same as the log I'm going to investigate myself and let you know what I find out. And I have to stress your tests are very interesting anyway thank you !!
Ok sure, I set the render size to 1000 x 1250 for this round.
One default genesis 3 female, optix off, threshold set at 512 1024
CUDA device 0 (GeForce GTX 1070): Processing scene...
Geometry memory consumption: 13.7055 MiB (device 0), 0 B (host)
Initializing light hierarchy.
Texture memory consumption: 354.251 MiB (device 0)
Light hierarchy initialization took 0.00s
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 70.2031 KiB (GPU)
Scene processed in 12.253s
Allocated 28.6105 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
Received update to 00275 iterations after 37.830s.
Convergence threshold reached.
One default genesis 3 female, optix off, threshold set at 8192 16384
CUDA device 0 (GeForce GTX 1070): Processing scene...
Geometry memory consumption: 13.7055 MiB (device 0), 0 B (host)
Initializing light hierarchy.
Texture memory consumption: 2.49219 GiB (device 0)
Light hierarchy initialization took 0.00s
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 70.2031 KiB (GPU)
Scene processed in 4.548s
Allocated 28.6105 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
Received update to 00257 iterations after 28.940s.
Convergence threshold reached.
2 g3f at same large setting
CUDA device 0 (GeForce GTX 1070): Processing scene...
Geometry memory consumption: 27.4106 MiB (device 0), 0 B (host)
Initializing light hierarchy.
Light hierarchy initialization took 0.00s
Texture memory consumption: 4.98438 GiB (device 0)
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 122.996 KiB (GPU)
Scene processed in 4.665s
Allocated 28.6105 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
Received update to 00342 iterations after 49.124s.
Convergence threshold reached.
3 G3F
CUDA device 0 (GeForce GTX 1070): Processing scene...
Geometry memory consumption: 41.1157 MiB (device 0), 0 B (host)
Initializing light hierarchy.
Texture memory consumption: 7.47656 GiB (device 0)
Light hierarchy initialization took 0.00s
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 175.789 KiB (GPU)
Scene processed in 4.913s
Allocated 28.6105 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
Received update to 00347 iterations after 64.432s.
Convergence threshold reached.
4 G3F
CUDA device 0 (GeForce GTX 1070): Processing scene...
Geometry memory consumption: 54.8208 MiB (device 0), 0 B (host)
Initializing light hierarchy.
Texture memory consumption: 9.96875 GiB (device 0)
Light hierarchy initialization took 0.00s
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 228.582 KiB (GPU)
Scene processed in 5.195s
Allocated 28.6105 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
Received update to 00364 iterations after 77.764s.
Convergence threshold reached.
5 G3F
CUDA device 0 (GeForce GTX 1070): Processing scene...
Geometry memory consumption: 68.526 MiB (device 0), 0 B (host)
Initializing light hierarchy.
Texture memory consumption: 12.4609 GiB (device 0)
Light hierarchy initialization took 0.00s
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 281.375 KiB (GPU)
Scene processed in 5.584s
Allocated 28.6105 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
Received update to 00363 iterations after 91.393s.
Convergence threshold reached.
6 G3F
CUDA device 0 (GeForce GTX 1070): Processing scene...
Geometry memory consumption: 82.2311 MiB (device 0), 0 B (host)
Initializing light hierarchy.
Texture memory consumption: 14.9531 GiB (device 0)
Light hierarchy initialization took 0.00s
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 334.168 KiB (GPU)
Scene processed in 5.768s
Allocated 28.6105 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
Received update to 00380 iterations after 101.957s.
Convergence threshold reached.
7 G3F
CUDA device 0 (GeForce GTX 1070): Processing scene...
Geometry memory consumption: 95.9362 MiB (device 0), 0 B (host)
Initializing light hierarchy.
Texture memory consumption: 17.4453 GiB (device 0)
Light hierarchy initialization took 0.00s
Lights memory consumption: 948 B (device 0)
Material measurement memory consumption: 0 B (GPU)
Materials memory consumption: 386.961 KiB (GPU)
Scene processed in 6.088s
Allocated 28.6105 MiB for frame buffer
Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
Received update to 00372 iterations after 113.202s.
Convergence threshold reached.
Out of time to play for now, but these results seem super weird. Take the last one for example: it reports textures using up 17 GB, and my machine only has 16 GB (8x2)... And the GPU is only using 3.5 GB, yet it's still rendering. What voodoo is this? I expected my GPU to be maxed out like in the last round of tests, but it wasn't when all figures used the same default texture. This is a really interesting discovery you stumbled on, Padone, well done.
Oh, noticed one other thing, frame buffer and workspace always the exact same number, and takes 0.000 seconds to do, maybe that is what is holding the vram hostage until close? Oh yeah, the round of tests I did before was also without NGS2, but was using non-free skin sets.
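One pattern in the logs above is that the reported texture memory grows by one figure's worth for every G3F added. Here is a quick Python check of that, using only the numbers from the log excerpts; the tolerance and the function name `scales_linearly` are arbitrary choices for the sketch.

```python
PER_FIGURE_GIB = 2.49219  # one default G3F, thresholds at 8192/16384

# Figure count -> texture memory reported in the log (GiB).
reported_gib = {
    1: 2.49219, 2: 4.98438, 3: 7.47656, 4: 9.96875,
    5: 12.4609, 6: 14.9531, 7: 17.4453,
}


def scales_linearly(reported, per_figure, tolerance=0.001):
    """True if every reported value is count * per_figure within tolerance."""
    return all(
        abs(value - count * per_figure) <= tolerance
        for count, value in reported.items()
    )


print(scales_linearly(reported_gib, PER_FIGURE_GIB))  # True
```

That is what you would expect if the log simply adds up every copy of every map, whatever actually ends up in VRAM.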
@TheKD Thank you for running all those tests! They kind of bear out what was tested (though not so thoroughly!) in the prior thread I mentioned.
I think DS is 'accumulating' Geometry, Texture, etc, memory size AS it passes it through the CUDA interface to the nVidia card. In other words, it's basing its 'size' on the UNCOMPRESSED raw size in memory as it is loaded from file. Which can be MUCH much larger than its final size (we don't keep bitmaps or JPGs in memory in raw RGBA32 format.) So it's accumulating those huge values, which are very inaccurate, and it's doing it for each and every file load (which may/may not even be utilized by Iray, depending on whether or not memory optimization is set) and before Iray gets it and compresses it (or not) based on the Compression settings.
Which is why the DS Log file can report 9+ GB of texture memory used, then it renders on a single 6GB card without falling to CPU.
Did some tests, pretty much the same results as TheKD .. it now seems clear to me that the log is crap .. The good news is that it seems it's the log that can't understand when textures are the same, while iray understands it fine. The bad news is that with this data I no longer believe that iray can use the shared memory. It was all bad data in the log.
COMPRESSION ON     log     vram    ram
1x G3F             350M    208M    1.4G
4x G3F             1.4G    267M    1.9G
Barefoot Dancer    1.4G    702M    4.9G
COMPRESSION OFF    log     vram    ram
1x G3F             2.5G    767M    2.9G
4x G3F             9.9G    831M    3.5G
Barefoot Dancer    FAIL    FAIL    6.4G    (fell back to CPU)
Now if the DS team could fix this bug please ..
p.s. @hphoenix I believe the log takes into account compression because it delivers very different results with compression on/off. The issue seems to be it doesn't understand when textures are the same so it gives out wrong data.
Either the log is wrong, or maybe it calculates them all as separate maps first, then after the calculation it realizes that a lot of these maps are the same and readjusts. Something is going on for sure, and it makes it harder for us users to optimize our scenes and make sure they are going to render on the GPU. I take a few characters in one simple scene, all goes well and renders fast. Take those characters to another simple scene and it kicks to CPU all the time or takes forever to render. It's hit or miss a lot for me, and it does get frustrating. The main reason why I did these tests was to take a break from another scene rendering that was driving me crazy lol.
Another thing I remembered though, in my g3f test with different skins, 4-7 should not have been able to render on my card because it only has 8gb vram but it did. That cannot be explained by the duplicate maps not being calculated correctly in log file.
The test you're referring to has 11.4G on the log and 7.3G in gpu-z, compression off. So I don't get why you say it shouldn't render. 11.4G is bad data, and 7.3G fits fine in the card. I'm sure if you do the same test with compression on it will render anyway, even using less ram/vram.
I definitely agree with you that this bug is very annoying. Having a good measure of texture memory is essential to plan renderings. Maybe in this case Memory Assistant could help https://www.daz3d.com/iray-memory-assistant but I also hope DAZ will fix the log.
edit: Maybe you take 11.4G as good because of the different skins. Unfortunately, even in this case the log data is bad. If a single texture is used for multiple materials it is counted multiple times by the log, while it's a single texture in iray. This happens for example in the G3F eyes that use the same texture for Pupil, Iris and Sclera materials. See my 4th post.
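The difference between counting per material and counting per unique texture can be sketched in a few lines of Python. The 4 MiB size is a placeholder and the two helper names are invented for the example; only the texture name and the three eye materials come from the thread. This illustrates the explanation above and is not a description of how the log or Iray is actually implemented.

```python
MIB = 1024 ** 2

# Material -> (texture file, size in bytes). The size is a placeholder.
eye_materials = {
    "Pupil": ("RyJeane_eyes01_1007", 4 * MIB),
    "Iris": ("RyJeane_eyes01_1007", 4 * MIB),
    "Sclera": ("RyJeane_eyes01_1007", 4 * MIB),
}


def per_material_total(materials):
    """Count a texture once for every material that uses it."""
    return sum(size for _, size in materials.values())


def unique_total(materials):
    """Count each distinct texture file only once."""
    unique = {name: size for name, size in materials.values()}
    return sum(unique.values())


print(per_material_total(eye_materials) // MIB)  # 12
print(unique_total(eye_materials) // MIB)        # 4
```

With one map shared by three materials, the per-material count is three times the unique count, which is the kind of gap seen between the log and GPU-Z.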
Ah right, it's still a lot of maps per character(I counted one random one, it had 22 4096 maps and 5 2048 maps), but some surfaces are using the same maps over. I remember reading the latest update made that memory assistant script way less accurate, so I stopped using it. New round, taking default genesis 3, putting on a simple mapless shader to override all maps.
1 4k texture map
Texture memory consumption: 64.001 MiB (device 0)
1 8k texture map
Texture memory consumption: 256.001 MiB
1 16k map
Texture memory consumption: 1 GiB
1 32k map
Texture memory consumption: 320.001 MiB
That can't be right, I think it must have just dropped the map for being far too large. So gonna stick with all 16k maps then. Plug in 9 different 16k maps, if the using ram theory is right, it will render, if wrong, will drop to CPU. I am not sure if it will even work though, so starting with 4 16k maps and going from there. After doing the first test, I realized that I cannot set the compression thresholds high enough, so that test method is no good. :(
Ok, new try, can fit 17 8k maps onto color channel on one default genesis, turned compression threshold over 8k map size.
one genesis with all different maps makes Texture memory consumption: 4.25 GiB.
Ah darn two genesis 3 with 17 different 8k maps each fell back to CPU. Booooooooooooooooooooooooo! Hisssssssssssssssssssssssssssss!
Also, 2 duplicate genesis with the same 17 8k maps rendered fine in GPU reporting Texture memory consumption: 8.5 GiB. It seems maybe the verbose reporting for log is just not verbose enough, it must optimize after that and drop the dupe maps or something. Back to square one and blundering in the darkness of the VRAM lol. Man, that sucks lol, I thought you made the iray discovery of the century :P
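The numbers in this last round line up with a simple budget check. The Python sketch below assumes 4 bytes per pixel for an uncompressed color map (which matches the 64 MiB, 256 MiB and 1 GiB figures logged for 4k, 8k and 16k maps), an 8 GiB card, and that duplicate maps are stored only once. It ignores geometry, frame buffer and work space, so it is a rough lower bound and not a prediction tool.

```python
GIB = 1024 ** 3
CARD_VRAM = 8 * GIB  # 8 GB card, as stated earlier in the thread


def map_bytes(side, bytes_per_pixel=4):
    """Uncompressed size of a square map, ignoring mipmaps and padding."""
    return side * side * bytes_per_pixel


one_set = 17 * map_bytes(8192)  # 17 different 8k maps on one figure

print(one_set / GIB)               # 4.25
print(2 * one_set / GIB)           # 8.5
print(one_set <= CARD_VRAM)        # True:  two figures sharing one set
print(2 * one_set <= CARD_VRAM)    # False: two figures, 17 different maps each
```

Two figures with different map sets need 8.5 GiB of unique textures and do not fit, while two duplicates only need 4.25 GiB even though the log reports 8.5 GiB for them.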
Yes, that would have been cool LOL .. I did similar tests to yours and hit the CPU fallback as soon as I tried to go beyond the VRAM. At least together we found out how the log and compression work, so there's still something good anyway. Thank you very much for your help and tests !!