Myths and Memory Issues with IRay in Daz 4.8
I spent about 30 hours doing test renders in IRay over the weekend and learned a few interesting things, I think.
First, the common wisdom is that when you try to render a scene larger than your Nvidia card's VRAM can handle, the card will "fail over" and dump the whole job onto your CPU. That is easily proved incorrect. All you have to do is use the Advanced tab in Render Settings to force the render to run on your graphics card only. No matter what size your scene is, if you watch in Task Manager you will see that the render calculation load never transfers to the CPU. The GPU still does the processing. What DOES happen is that the scene calculation data (some sort of massive array, I assume) gets loaded into your system RAM when it doesn't fit into the VRAM. I have no way to test whether it's using the video card's VRAM and system RAM at the same time, but the system RAM is definitely 100% tasked to support the calculations.
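If you want to watch this yourself rather than eyeball Task Manager, here is a minimal monitoring sketch. It assumes Python with the psutil package installed and the nvidia-smi command-line tool that ships with the Nvidia driver; the sampling loop and formatting are my own, not anything Daz or Nvidia provides.

```python
# Minimal sketch: sample CPU load, system RAM, and GPU load/VRAM once per
# second while a render runs. Assumes the psutil package (pip install
# psutil) and the nvidia-smi tool from the Nvidia driver are on the PATH.
import subprocess
import time

import psutil

while True:
    cpu = psutil.cpu_percent(interval=None)   # % CPU since the last call
    ram = psutil.virtual_memory()             # system RAM snapshot
    gpu = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True,
    ).stdout.strip()
    print(f"CPU {cpu:5.1f}% | RAM {ram.used / 2**30:4.1f}/"
          f"{ram.total / 2**30:.1f} GiB | GPU {gpu}")
    time.sleep(1)
```

Run it in a console before you hit Render and you can see for yourself whether the CPU, the GPU, or the system RAM is the thing being saturated.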
This is where the argument comes in that tasking both your CPU and GPU to work on a render actually slows the render. There is merit to that. What I noticed was that forcing the GPU to do the heavy lifting seems to free up the CPU to do the "housekeeping" of breaking up the array and sharing it across the VRAM, the RAM, and even the hard drive.
Yes. I said the hard drive. Here is the really interesting part.
In my experience an extremely optimized scene (all good-quality IRay shaders) with no Genesis figure can render in minutes on an Nvidia card with even 2 GB of VRAM. Adding ONE Genesis 3 figure to the scene boosts the amount of memory needed to somewhere around 4 GB to 6 GB of RAM. It can still be rendered in a few minutes if you can keep the total load under 8 GB of RAM and leave your CPU free to manage shuttling the math iterations in and out of the system RAM.
In my case, my system has only 8 GB of RAM. Adding a second Genesis 3 figure to the scene - or even one Genesis 3 figure with poorly selected shaders - runs past the 8 GB of RAM and forces the CPU to start paging the math iterations to the hard drive. THAT is when IRay grinds almost to a halt. In fact, if you try to render a scene with 4 Genesis 3 figures on a machine with only 8 GB of RAM, the CPU will become overwhelmed trying to balance the math iterations and your system will most likely freeze or crash. OptiX acceleration, by the way, increases the memory demand quite a bit and makes your system crash much more easily, though it WILL speed up a very tiny, optimized render.
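To make the arithmetic concrete, here is a back-of-the-envelope sketch using the rough numbers above. The per-figure and overhead constants are my own estimates from these tests, not official figures.

```python
# Back-of-the-envelope render-memory budget using the rough numbers
# observed above. All constants are estimates, not official specs.
def will_page(num_figures, gb_per_figure=5.0, base_scene_gb=1.0,
              system_ram_gb=8.0, os_overhead_gb=2.0):
    """True if the render likely spills past RAM into the pagefile."""
    needed_gb = base_scene_gb + num_figures * gb_per_figure
    usable_gb = system_ram_gb - os_overhead_gb
    return needed_gb > usable_gb

for n in range(1, 5):
    status = "pages to disk" if will_page(n) else "fits in RAM"
    print(f"{n} G3 figure(s): {status}")
```

With these estimates, one figure just squeaks into an 8 GB machine and everything beyond that spills to the pagefile, which matches what I saw.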
So...my take on this is that your first priority should not be a graphics card with 12 GB of VRAM. That money would be much better spent on the fastest and biggest SSD you can afford. That should give you a dramatic boost in render speed when a larger render starts paging to the hard drive. Next priority would be as much system RAM as you can stuff into your box. Last priority would be the biggest and most expensive Nvidia graphics card you can afford.
Please note - this is all related to Daz 4.8. From what I've read, the IRay implementation in the 4.9 beta is being refined quite a bit. But in any case, doing these upgrades wouldn't hurt.
Comments
That's good to hear. Now that I'm doing more 3DL again I've been rethinking my strategies for my next computer.
And the biggest lack in my present system is no SSD.
Sorry, mate, my experience is in direct contradiction to yours.
When my video card doesn't have enough memory to hold the scene, it does NOTHING and the scene is rendered via CPU (as evidenced by my monitoring utility which clearly shows the GPU dropping to a meagre 5-10% while all 8 cores of my CPU chug away at 100%). And this with CPU unchecked in the Advanced section.
And I believe the difference you are observing between Genesis 2 and Genesis 3 is the fact the majority of Genesis 3 models come with more and larger textures than the standard Genesis 2 and this is what takes up all the available memory.
My tests also contradict your findings. With GPU being the only thing checked on Photoreal and Interactive, anything over the GPU VRAM limit causes the renderer to stop with a blank window. Logs show something like: "GPU1 failed, disabling... no other suitable render devices... aborting render." That's with CPU disabled.
One question...what video card?
I have seen both behaviors: a scene with two V4s, two G2s, and one G3, plus clothes etc., takes forever to prepare the graphics card for rendering, due to all the swapping and hundreds of hard faults per second. Once that's done, it successfully starts on the graphics card. I noticed that when I leave that render open without saving, it doesn't release the commit charge or memory in use - presumably because it is staying prepared to resume the render. And indeed, when I resume the render it starts up pretty quickly. Because I love trouble, I started another big render of the same scene without closing the first. The system nearly froze, but I was able to limp the mouse over to the first render and close it. I saw the commit charge and the memory usage drop, but then noticed the card wasn't working on the second render anymore. Something in the released memory knocked the legs out from under the second render. By canceling the render and then resuming, the card was back to work.
I have a GeForce Titan X and 16 GB of RAM. I'm running Daz 4.8. The SSD may be my next purchase; I'm maxed everywhere else and it's not enough. Thanks for that tip, tring01. I also didn't know OptiX acceleration takes more memory - that is my next experiment...
Wow, that's amazingly useful information. I wish I had seen this before I bought my 980 ti... which I just placed an order for about 24 hours ago.
So you've stumbled on something even Nvidia doesn't know about, and they wrote all the code for this engine.
I'd let them know and see if a response is offered.
See this post by prixat in another thread...
http://www.daz3d.com/forums/discussion/comment/989985/#Comment_989985
That should pretty much put the end to this myth.
I was hoping it was magic software :(
The only possible way that I can see it doing out-of-core usage is on certain mobile chipsets... maybe.
Don't worry, the OP's conclusions aren't nearly as universal as presented. Even based on his own info, buying a big SSD as the first priority is a poor decision. Maxing your system RAM is a much better and cheaper choice, and will avoid paging to the HD in the first place. To add to Statdragon's subtle remark, apparently Nvidia didn't know this (not). If you want a fast Iray render, the basic advice hasn't changed for months: the best hardware you can afford (most CUDA cores and VRAM in the GPU, the most system RAM, the fastest CPU). This will also help everything else you do.
Finally, when discussing render speed, folks seem to ignore that Iray render speeds are very dependent on the artist's hardware AND the scene.
I actually already have maxed system memory and a very good solid-state drive. I would have saved money and gone with a 970, but I really blew my budget for the month because I thought the GPU wouldn't work if the memory was too low (it's 4 GB on the 970 and 6 GB on the 980 Ti).
You seem to be missing the point of the subsequent posts. The OP is woefully incorrect in his observations. He is patently wrong. The correct advice is still: lots of CUDA cores and as much VRAM as you can afford.
That is in addition to the fact that Nvidia says what was described is not possible...yet.
My concern with an SSD is that any failover to the HD will require paging. AFAICT no amount of physical RAM prevents this, and if SSDs have a finite read/write lifespan, what is happening to the SSD long term?
This is an IT grey area for me, so I'm not saying this does happen, but I think there is some concern to be had here; to what level, I'm not clear.
Not sure how you can be so definite about that. If you look at the earlier posts, some others have experienced what I did. There is apparently something else at play here that causes the GPU to participate in some cases and not in others. I don't have a variety of systems to test on, so I cannot explain why, but I can assure you that my Nvidia GPU DOES handle the render processing no matter the size of the scene, if I have only it checked under Photoreal in the Advanced tab.
That said - whether the GPU or CPU is doing the processing is only a secondary issue relative to my main point. My main point is that scenes with multiple G3 figures in them are consuming anywhere from 4 GB to 6 GB per figure. In other words, they consume a surprisingly huge amount of RAM during the render. If you have a machine with 64 GB or 128 GB of RAM you may not see paging to the hard drive - but many users are hobbyists like me, running machines with 8 GB or 16 GB of system RAM. For someone with a system like mine, paging to the hard drive during the render is INEVITABLE - no matter whether the CPU or GPU is doing the math. In this case I feel one is far better off putting $200 into a fast SSD if that's all they have to invest. They often cannot expand their system RAM significantly without upgrading to a much more expensive motherboard, etc. Upgrading to a 980 Ti for $700 will not help them as much as a $200 SSD, because their first performance bottleneck is memory - since any render with two or more G3 figures WILL fail over and not use the VRAM, and thus will have to page to the hard drive.
I think we're all talking about the same reality, but from different points of view. I probably should have titled this post something like "What is the best way to spend $200 to speed up your IRay renders?", or something along those lines.
Very interesting. I only have one machine with an Nvidia graphics card and it definitely does use the GPU to do the calculations even if the VRAM is insufficient (with 2GB of VRAM, that is almost every render). In my case, where every render pages to the hard drive, I'm far better off investing a limited budget in a fast SSD. Everyone isn't in the same situation as I am.
In Windows, you can set your swap space to zero if you have sufficient memory. It isn't recommended: if you overflow your memory, things will start crashing (since even system processes may not be able to allocate new memory). But it can be done, and if you have 32-64 GB of RAM, keep an eye on it (not just simple memory, but actual used/reserved/free), and know you won't get close to running out, you can do it.
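If you want to check your actual headroom before trying this, here is a small sketch, assuming Python with the psutil package; the "comfortably below" threshold is yours to pick.

```python
# Quick headroom check before considering a zero-pagefile setup.
# Assumes the psutil package; looks at used/available, not just "free".
import psutil

vm = psutil.virtual_memory()
sm = psutil.swap_memory()
print(f"RAM: {vm.total / 2**30:.1f} GiB total, "
      f"{vm.available / 2**30:.1f} GiB available ({vm.percent}% used)")
print(f"Swap: {sm.total / 2**30:.1f} GiB total, "
      f"{sm.used / 2**30:.1f} GiB used")
# Per the advice above: only run without a pagefile if your worst-case
# scene plus the OS stays comfortably below the physical RAM total.
```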
SSDs should NEVER be used for swap files, though... the constant R/W cycles will reduce their lifespan considerably.
Really? I would agree paging to the hard drive is a major use of the effective life of any hard drive. Since it is universally done and inevitable on any normal Windows machine, do you know if SSDs fare better or worse than mechanical hard drives at handling paging? If SSDs are demonstrably much more delicate than mechanical hard drives, I would agree with you. I've never seen anyone state that anywhere, though. In fact, most claims I've seen say that SSDs are much more durable than mechanical hard drives. Is that incorrect?
Because Nvidia has repeatedly said the capability to do that DOES NOT EXIST...at this time!
Well, the fact that you ignored everyone else here who said exactly the same thing I did kind of bothers me. I see one person who agrees with you, sort of, and a list of people who are telling you that you are mistaken in your observations.
And I did give you a valid reason as to why Genesis 3 takes up so much more RAM: all the texture maps that are included with G3. With G2 you have diffuse, specular, bump, and maybe a normal map. With G3 you have diffuse, specular, bump, translucency, AND normal. That's the new standard: including all of those map types. And BIG maps at that.
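For a sense of scale, here is a rough sketch of how those maps add up. The resolution, surface count, and uncompressed-RGBA assumption are all illustrative guesses on my part, not measured G3 figures.

```python
# Illustrative texture-memory estimate for one figure. The resolution,
# surface count, and uncompressed-RGBA assumption are guesses, not specs.
MAPS_PER_SURFACE = 5    # diffuse, specular, bump, translucency, normal
SURFACES = 8            # face, torso, limbs, eyes, mouth, nails, etc.
RES = 4096              # edge length of a typical "4K" map, in pixels
BYTES_PER_PIXEL = 4     # 8-bit RGBA, held uncompressed for rendering

one_map_bytes = RES * RES * BYTES_PER_PIXEL          # 64 MiB per map
total_bytes = one_map_bytes * MAPS_PER_SURFACE * SURFACES
print(f"{one_map_bytes / 2**20:.0f} MiB per map, "
      f"{total_bytes / 2**30:.1f} GiB of raw texture data per figure")
```

Even with these conservative guesses, textures alone can run to a couple of gigabytes per figure before geometry or the renderer's own working set is counted.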
Actually, Nvidia's response to this question, as recently as yesterday, was:
"Thanks for the request. Various means of managing out of core memory are being examined, but there's nothing to announce at this time.
Please note that this situation is more critical for Octane in that it's purely GPU, whereas Iray will simply fallback go using the CPU(s) when GPU memory is exceeded.
- Phil"
Read carefully what the Nvidia rep said - and was careful not to say - above. Realize he has to avoid any chance of a lawsuit by only putting out the most conservative statement of IRay's capabilities that is "legally" correct.
If you read what the NVidia rep said carefully, you will see he is NOT saying the GPU is unused in a fallback situation, only that the CPU IS used. That is entirely consistent with what I observed and posted. If I enabled CPU only on a larger scene, it used 100% of my CPU. If I enabled GPU only, it used only about 25% of my CPU and rendered faster. It seems obvious to me that this is not a black and white situation. What I observed is not inconsistent with what Nvidia has said and is hinting about. It is also not inconsistent with what other posters on this thread have observed.
In any case, the issue I'm talking about is memory usage - not which processor is doing the math. I stand by my post.
Iray will fall back to the CPU *IF* you have CPU also checked. It won't fall back to the CPU if CPU is unchecked; it will simply not render anything. Regarding your advice to buy an SSD over RAM: that is not a good idea. RAM is cheaper and faster than an SSD, and will prevent paging to disk in the first place.
Nvidia can't use 'out of core' memory...YET.
It's not a matter of what legalese is being said...
What I've read is that older SSDs have a 20 TB write lifespan, which sounds like quite a bit but in a swap situation is not much at all; newer SSDs do far better. The flash memory cells wear out on any flash memory device, so it's variable, but it's not out of the question for a vendor to bundle an old set of chips in a new box and clean out the warehouse to get rid of overstock.
It's this difference precisely that makes SSDs in general bad for swap-file use. Read cycles on SSD are virtually free, wear wise. But Write cycles on Flash memory wear out the cells much more significantly. Swap files are used heavily in Write/Read/ReWrite cycles. Standard HDDs handle the constant writing much better. This is why all the OS install guides will recommend moving your swap file to a non-SSD drive.
I'm not sure where you get the '20 TB' write lifespan, as flash memory lifespans are measured in number of write cycles, not a total data size.
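The two measures are roughly convertible, though: total bytes written is about capacity times rated program/erase cycles, divided by write amplification. Here is a sketch with illustrative numbers; none of them is a real drive's rating.

```python
# Rough conversion between write-cycle ratings and total-bytes-written.
# Every constant here is an illustrative assumption, not a drive spec.
capacity_gb = 256
pe_cycles = 3000            # rated program/erase cycles per cell
write_amplification = 2.0   # controller overhead factor

tbw_tb = capacity_gb * pe_cycles / write_amplification / 1000
swap_gb_per_day = 20        # a heavy paging workload, assumed
years = (tbw_tb * 1000) / swap_gb_per_day / 365
print(f"~{tbw_tb:.0f} TB of endurance, ~{years:.0f} years at "
      f"{swap_gb_per_day} GB/day of swap writes")
```

With these assumptions the drive endures hundreds of terabytes of writes, which is why the per-cycle and per-terabyte figures quoted in this thread can both be "right" depending on the drive.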
No it doesn't.
The GPU cores in an nVidia card aren't only used for Iray. They might shuttle data, for example, and may be active during routine display adapter shading events; this is particularly true if your card is connected to a monitor. The purpose and use of CUDA cores goes far beyond Iray, as they are essentially generic floating-point processors. Just sitting doing nothing, not even moving the mouse, the GT 620 connected to my monitor will regularly ramp up and down from 0 to 10+% GPU load. This in itself means nothing.
Doing accurate benchmarks requires a little more than simplistic timing. To get an accurate assessment, you must make sure the scene database fully unloads from RAM. Otherwise, subsequent renders could appear to be faster even when the rendering speed is diminished. Scene building and rendering are two distinct phases; the first takes place on the CPU using system RAM, and may be cached for subsequent renders.
How RAM is used for basic display housekeeping is irrelevant, and different nVidia drivers may in fact cooperatively share system RAM and VRAM for certain tasks. This has nothing to do with Iray, the subject of the thread you started.
Check the SSD endurance experiment. Bottom line, you'll be scrapping out your system before your SSD dies. The low-end drives were showing signs of failure at 200 TB of writes; two went beyond 1.5 petabytes of writes. Full synopsis and pointer to the start of the testing here: http://techreport.com/review/27909/the-ssd-endurance-experiment-theyre-all-dead
All you're seeing here is that scene/memory management is always done by the CPU (a single thread), even when Iray is set to GPU only.
The reason why GPU only mode is sometimes faster is that load balancing is not possible in mixed modes and you get less efficient rendering overall.
Rendering to the pagefile is not inevitable, because I have no pagefile at all. :P
However, my scenes cannot go over 14 GB, or Daz crashes. (16 GB of RAM, no nVidia card on this computer.)
So the only thing that's inevitable is that it will "try" to use the pagefile in the event of going over physical RAM and/or VRAM. But that is semantics.
Great info and interesting observation.
P.S. I use an SSD for everything. It has about 3 million writes per block, but it does not write every time. Old SSDs had 100,000 cycles, years ago. New SSDs, from 2007 and up, are about 1-5 million cycles and more. If the data is a 1 and it needs to write a 1, it skips it, essentially doubling write potential. When it does write, it does not write over the same block that something is "set to be erased from"; it moves to a less-used block, then writes there. (Thus, fragmentation occurs, but it is nearly irrelevant with SSDs, since the seek speeds are nearly as fast as contiguous read speeds for split files.)
That is why you don't ever defragment SSDs: it breaks the "did I write here yet" and "how many times did I write here" counts. That destroys SSDs, and that is why Windows, by default, no longer defragments SSDs. (Though it is good to do once Windows does a big update, or after installing things, because fragmented files load a fraction of a second longer... when 1.5 seconds is too long, 1.4-second loading times will make all the difference in the world! {That was a sarcastic joke})
Even under heavy writing loads, you are talking about 20-50 years of life for an SSD. That outlasts the 5-10 years that regular HDs live before the surfaces start to corrode on the platters. (It looks like spider webs when you take them apart: cheap surface-blasting bonds that corrode, thanks to that little "breather hole" in all drives.)