Apache OpenOffice (AOO) Bugzilla – Issue 9277
OOo reacts slowly when system is under load with lower priority than OOo
Last modified: 2003-09-08 12:00:08 UTC
Hi, I run a distributed computing client on my Linux machine (www.distributed.net), and when it is on, OOo is very slow. Probably the problem is that OOo looks at the system, sees it is under 100% load, and doesn't try to get much CPU power. However, the distributed.net client runs at the lowest priority, so any other program will get the CPU first if it needs it. All other programs run fine alongside distributed.net. Thanks in advance
10:38 < sethbc> haggai: i get a lot of bug reports that when a system is under heavy load
10:38 < sethbc> haggai: openoffice runs like crap
10:38 < sethbc> haggai: today, i got a pretty interesting one
10:38 < sethbc> i'm talkin doesn't pick up keystrokes for a while
10:39 < sethbc> Gentoo is very compile based, so system load can be high a lot of the time
10:39 < sethbc> i can replicate this, and i've seen a couple things in IZ about it
10:39 <@haggai> oh,right. Yes, I've seen that sort of stuff and always assumed the machine was simply busy swapping
10:41 < sethbc> haggai: thats not it i don't think, one of my reports mentions a schedule call
10:43 < sethbc> sched_yield
10:45 < sethbc> let me find some IZ numbers, but basically, the gist is that sched_yield is totally different now
10:46 < sethbc> and that under heavy load, calling sched_yield will sleep a process almost indefinitely (because of the new O(1) scheduler)
10:47 < sethbc> O(1) is used on 2.5, and in some 2.4 implementations (i think its mainstream, need to check)
10:49 < sethbc> so...and i know this is a HUGE change...most of the sched_yield calls should be changed
10:50 < sethbc> http://www.codemonkey.org.uk/post-halloween-2.5.txt
10:50 < sethbc> ^^ shows the differences in the calls
10:50 < sethbc> basically, i run 2.5, and if i don't nice my compiles, i don't write a document
10:52 < sethbc> haggai: the part where he discusses sched_yield, he specifically mentioned openoffice .... and in this case, any press isn't good press =)
10:52 < sethbc> yeah, i know you can't just do a find and replace on sched_yield, but...its a problem
11:18 < sethbc> haggai: it looks like sched_yield is defined to pthread_yield

http://www.codemonkey.org.uk/post-halloween-2.5.txt contains this section:

Process scheduler improvements.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Another much talked about feature. Ingo Molnar reworked the process scheduler to use an O(1) algorithm. In operation, you should notice no changes with low loads, and increased scalability with large numbers of processes, especially on large SMP systems.
- Robert Love wrote various utilities for changing behaviour of the scheduler (binding processes to CPUs etc). You can find these tools at http://tech9.net/rml/schedutils
- The behavior of sched_yield() changed a lot. A task that uses this system call should now expect to sleep for possibly a very long time. Tasks that do not really desire to give up the processor for a while should probably not make heavy use of this function. Unfortunately, some GUI programs (like Open Office) do make excessive use of this call and under load their performance is poor. It seems this new 2.5 behavior is optimal but some user-space applications may need fixing.
- The above applies to use of yield() in the kernel, too.
- 2.5 adds system calls for manipulating a task's processor affinity: sched_getaffinity() and sched_setaffinity()
- Regressions to mingo@redhat.com and rml@tech9.net

So it looks to me that we must find a fix for this on both the 1.0 and 1.1 branches - as this will affect more and more users as time goes on.
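To illustrate the difference the quoted notes describe, here is a minimal sketch (not OOo code) contrasting a loop that checks a file descriptor and calls sched_yield() between checks with a single blocking poll() call. The helper names wait_by_yielding and wait_by_blocking are hypothetical and exist only for this example. Under the O(1) scheduler each sched_yield() in the first variant may put the caller to sleep for a long time, while the second variant sleeps in the kernel only until the descriptor is readable or the timeout expires.

```c
#include <poll.h>
#include <sched.h>

/* Hypothetical helper: spin until fd is readable, yielding between checks.
 * Returns the number of sched_yield() calls that were made. On a loaded
 * system with the O(1) scheduler, each yield may sleep for a long time. */
long wait_by_yielding(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    long yields = 0;

    /* A timeout of 0 makes poll() return immediately: this is a pure check. */
    while (poll(&pfd, 1, 0) == 0) {
        sched_yield();
        ++yields;
    }
    return yields;
}

/* Hypothetical helper: block in the kernel until fd is readable or the
 * timeout expires. Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_by_blocking(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

With a pipe as the descriptor, wait_by_blocking() returns as soon as the writer has written a byte, without the caller ever giving up its place in the run queue voluntarily.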
*** Issue 10212 has been marked as a duplicate of this issue. ***
*** Issue 9970 has been marked as a duplicate of this issue. ***
TM->MHU: Can you please have a look? JA told me that you might be the one who would investigate this problem. Thanks!
Hi Thorsten, Chris,

Yes, I have been investigating this (frequent yield() and poll() or select() calls), and I worked with the 'gsl' team on improving it (and will probably continue to do so).

The good news is that the current OOo 1.1 Beta build, as well as the upcoming OOo 1.0.3 build, is already much improved in this respect. Grep over the 'vcl' cvs archive for files committed with (Sun internal) bug ids '#107292#' and '#104559#', in particular 'vcl/unx/source/app/saldata.cxx'.

The bad news is that there are still some 'sched_yield()' calls left for situations where 'vcl' doesn't make any other interruptible call. The reason can be seen from a comment in 'sal/osl/unx/thread.c', where the function used, 'osl_yieldThread()', is implemented. That comment basically states that POSIX requires a thread that does not otherwise block (sleep, blocking I/O) to call 'sched_yield()'. This may be largely historical nowadays, but it ensures that OOo can still run with a 'user-land' / 'green' threads implementation, and not just with 'native' threads.

Unless we are willing to raise OOo's system requirements to 'native' threads (and this may well limit portability), I'm not going to simply remove those 'sched_yield()' calls.

On the other hand, I agree that there may be too many such calls, even after the above-mentioned changes. Further changes to eliminate them may well require some changes in 'vcl' application logic. Should someone come up with evidence to show that this logic is wrong, I'm willing to review that evidence (and patches) and help incorporate them into the code.

Hope that helps,
Matthias

BTW: the proper target would be 'OOo 2.0' instead of 'OOo Later', at least IMHO.
...and the appropriate project would be 'gsl' / 'code'. Thus changing project and target milestone...
Grief; I just hit this, and it's amazingly bad. With a stock 1.0.3.1, I get > 300 sched_yields to do a simple re-display, taking ~100ms each; that's tens of seconds on RH 9.0. I've added a:

    #ifndef LINUX
    sched_yield();
    #endif

which sucks really; but ... Would people object to hard-coding such platform-specific knowledge of thread greenness into the mainline?
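One way to keep that platform knowledge in a single place rather than scattered across call sites is a small wrapper, sketched below. This is not the actual OOo patch. The function name yield_if_cooperative and the flag HAVE_NATIVE_THREADS are hypothetical, and treating every Linux build (detected through the compiler-defined __linux__ macro) as having native threads is an assumption made only for this example.

```c
#include <sched.h>

/* Hypothetical build flag: defined when threads are scheduled preemptively
 * by the kernel, so a voluntary yield is not needed for progress.
 * Assumption for this sketch: every Linux build has native threads. */
#if defined(__linux__) && !defined(HAVE_NATIVE_THREADS)
#define HAVE_NATIVE_THREADS 1
#endif

/* Yields only where a cooperative ("green") thread library may need it.
 * Returns 1 if sched_yield() was called, 0 if the call was skipped. */
int yield_if_cooperative(void)
{
#ifdef HAVE_NATIVE_THREADS
    return 0;            /* native threads: skip the yield */
#else
    sched_yield();       /* green threads: let other threads run */
    return 1;
#endif
}
```

Call sites would then use yield_if_cooperative() unconditionally, and a port to a green-threads platform would only need to build without the flag.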
Adding comments from duplicate issue 14104:

<14104>
Open Office has always had this defect: when the processor is loaded, it runs slow. But with the new OS that I have installed (SuSE Linux 8.2), and with versions of OpenOffice 1.0.2 from 1.1.Beta, this becomes unacceptable. When a program occupies 85-97% of the CPU (no matter what % of memory), the OpenOffice is extremely slow and, generally, unusable. I wonder whether there is a solution. The problem seems to become more and more serious as the OS-s and gcc versions progress. Other similar applications, like koffice, Netscape do not display noticeable slowdown.

------- Additional Comments From Matthias Huetsch 2003-05-06 12:24 PDT -------

After upgrading my SuSE 8.1 to SuSE 8.2, I can only confirm this issue. And yes, it is indeed bad with even repaints delayed for more than a minute. Thanks for reporting this, even if it is duplicate to issue 9277. I will add this description to issue 9277, and take care that it will be fixed for OOo 1.1 at least.
</14104>
Having now seen the effect myself (due to issue 14104) I think this cannot wait for OOo 2.0, but needs to be addressed for OOo 1.1 at least, if not for OOo 1.0.4. Retargeting to OOo 1.1 RC...
*** Issue 14104 has been marked as a duplicate of this issue. ***
fixed in vcl10, review pending
Hi Christof,

Reviewed changes in vcl/unx/source/app/saldata.cxx, r1.23.28.1 -> offending call to osl_yieldThread() removed -> okay.

Tested cws_srx644_vcl10 build (unxlngi5.pro) on SuSE 8.2 -> application now responsive, even under 99% (user) CPU load -> okay.

As we evaluated some time ago, 'dtrans/source/X11/X11_selection.cxx' uses osl_yieldThread() in two methods:
* one doubtful call in 'SelectionManager::getPasteData()'
* two redundant calls in 'SelectionManager::dragDoDispatch()'

The first one should at least be evaluated, but the second one should really be fixed. Otherwise I wouldn't regard this issue as fixed.

Matthias
dtrans/source/X11/X11_selection.cxx, r1.61.4.1 now also removes the last calls to osl_yieldThread() -> reviewed okay -> verified.
closing this one to get it out of the way for rc

cp->tom ribbens: you'll need to wait for the next OOo build to harvest the fruits of this bug. Please feel free to reopen it if you think the fix doesn't suit your needs.
So this is where all the action happened! :) I filed this as bug 7139 a while ago but I guess it slipped under the rug. Shall I mark it a duplicate of this one?
If you think it's the same bug, sure.
*** Issue 7139 has been marked as a duplicate of this issue. ***
*** Issue 18822 has been marked as a duplicate of this issue. ***