Issue 9277 - OOo reacts slowly when system is under load with lower priority than OOo
Summary: OOo reacts slowly when system is under load with lower priority than OOo
Status: CLOSED FIXED
Alias: None
Product: gsl
Classification: Code
Component: code
Version: OOo 1.0.2
Hardware: PC Linux, all
Importance: P2 Trivial
Target Milestone: OOo 1.1 RC
Assignee: christof.pintaske
QA Contact: issues@gsl
URL:
Keywords:
Duplicates: 7139 9970 10212 11204 14104 18822
Depends on:
Blocks:
 
Reported: 2002-11-16 10:59 UTC by kosmo123
Modified: 2003-09-08 12:00 UTC (History)
CC List: 4 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description kosmo123 2002-11-16 10:59:22 UTC
Hi,

I run a distributed computing client (www.distributed.net) on my Linux machine,
and when it is running OOo is very slow. Probably OOo looks at the system, sees
that it is under 100% load, and does not try to claim much CPU time. However,
the distributed.net client runs at the lowest priority, so any other program
that needs the CPU should get it first. All other programs run fine alongside
distributed.net.

Thanks in advance
Comment 1 chris 2003-03-26 11:33:45 UTC
10:38 < sethbc> haggai: i get a lot of bug reports that when a system
is under heavy load
10:38 < sethbc> haggai: openoffice runs like crap
10:38 < sethbc> haggai: today, i got a pretty interesting one
10:38 < sethbc> i'm talkin doesn't pick up keystrokes for a while
10:39 < sethbc> Gentoo is very compile based, so system load can be
high a lot of the time
10:39 < sethbc> i can replicate this, and i've seen a couple things in
IZ about it
10:39 <@haggai> oh,right.  Yes, I've seen that sort of stuff and
always assumed the machine was simply busy swapping
10:41 < sethbc> haggai: thats not it i don't think, one of my reports
mentions a schedule call
10:43 < sethbc> sched_yield
10:45 < sethbc> let me find some IZ numbers, but basically, the gist
is that sched_yield is totally different now
10:46 < sethbc> and that under heavy load, calling sched_yield will
sleep a process almost indefinitely (because of the new O(1) scheduler)
10:47 < sethbc> O(1) is used on 2.5, and in some 2.4 implementations
(i think its mainstream, need to check)
10:49 < sethbc> so...and i know this is a HUGE change...most of the
sched_yield calls should be changed
10:50 < sethbc> http://www.codemonkey.org.uk/post-halloween-2.5.txt
10:50 < sethbc> ^^ shows the differences in the calls
10:50 < sethbc> basically, i run 2.5, and if i don't nice my compiles,
i don't write a document
10:52 < sethbc> haggai: the part where he discusses sched_yield, he
specifically mentioned openoffice .... and in this case, any press
 isn't good press =)
10:52 < sethbc> yeah, i know you can't just do a find and replace on
sched_yield, but...its a problem
11:18 < sethbc> haggai: it looks like sched_yield is defined to
pthread_yield

http://www.codemonkey.org.uk/post-halloween-2.5.txt contains this section:

Process scheduler improvements.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Another much talked about feature. Ingo Molnar reworked the process
  scheduler to use an O(1) algorithm.  In operation, you should notice
  no changes with low loads, and increased scalability with large numbers
  of processes, especially on large SMP systems.
- Robert Love wrote various utilities for changing behaviour of the
  scheduler (binding processes to CPUs etc). You can find these tools at
  http://tech9.net/rml/schedutils
- The behavior of sched_yield() changed a lot.  A task that uses
  this system call should now expect to sleep for possibly a very
  long time.  Tasks that do not really desire to give up the
  processor for a while should probably not make heavy use of this
  function.  Unfortunately, some GUI programs (like Open Office)
  do make excessive use of this call and under load their
  performance is poor.  It seems this new 2.5 behavior is optimal
  but some user-space applications may need fixing.
- The above applies to use of yield() in the kernel, too.
- 2.5 adds system calls for manipulating a task's processor
  affinity: sched_getaffinity() and sched_setaffinity()
- Regressions to mingo@redhat.com and rml@tech9.net

So it looks to me that we must find a fix for this on both the 1.0 and
1.1 branches - as this will affect more and more users as time goes on.
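
To make the kernel note concrete, here is a minimal sketch (not OOo code) of the pattern it warns about, next to a blocking alternative. It assumes native POSIX threads; `event_t`, `wait_by_yielding()`, `wait_by_blocking()` and `event_signal()` are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state; none of these names come from OOo. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cond;
    bool            ready;
} event_t;

void event_init(event_t *ev)
{
    assert(ev != NULL);
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->ready_cond, NULL);
    ev->ready = false;
}

/* Pattern the kernel note warns about: poll a flag and call sched_yield()
 * between checks. Under load, each yield may put the caller to sleep for a
 * long time, so the loop reacts late even after the flag has been set. */
void wait_by_yielding(event_t *ev)
{
    for (;;) {
        pthread_mutex_lock(&ev->lock);
        bool done = ev->ready;
        pthread_mutex_unlock(&ev->lock);
        if (done)
            return;
        sched_yield();
    }
}

/* Blocking alternative: sleep on a condition variable so the waiter is
 * woken by the signalling thread instead of repeatedly yielding. */
void wait_by_blocking(event_t *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->ready)
        pthread_cond_wait(&ev->ready_cond, &ev->lock);
    pthread_mutex_unlock(&ev->lock);
}

/* Mark the event as ready and wake every blocked waiter. */
void event_signal(event_t *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->ready = true;
    pthread_cond_broadcast(&ev->ready_cond);
    pthread_mutex_unlock(&ev->lock);
}
```

Both wait functions return once another thread has called `event_signal()`; the difference is only in how the waiting thread behaves while the flag is still unset.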
Comment 2 chris 2003-03-26 11:33:49 UTC
*** Issue 10212 has been marked as a duplicate of this issue. ***
Comment 3 chris 2003-03-26 11:36:13 UTC
*** Issue 9970 has been marked as a duplicate of this issue. ***
Comment 4 thorsten.martens 2003-04-07 10:13:07 UTC
TM->MHU: Can you please have a look ? JA told me, that you might be
the one, who would investigate this problem. Thanks !
Comment 5 matthias.huetsch 2003-04-07 20:01:45 UTC
Hi Thorsten, Chris,

Yes, I have been investigating this (frequent yield() and poll()
or select() calls), and I have worked with the 'gsl' team on improving
it (and will probably continue to do so).

The good news is that the current OOo 1.1 Beta build, as well as the
upcoming OOo 1.0.3 build, is already much improved in this respect.
Grep over the 'vcl' cvs archive for files committed with (Sun
internal) bug ids '#107292#' and '#104559#', in particular
'vcl/unx/source/app/saldata.cxx'.

The bad news is that there are still some 'sched_yield()' calls left for
situations where 'vcl' doesn't make any other interruptible call. The
reason can be seen from a comment in 'sal/osl/unx/thread.c', where
the function in question, 'osl_yieldThread()', is implemented. That comment
basically states that POSIX requires a thread that does not otherwise
block (sleep, blocking I/O) to call 'sched_yield()'.

This might be somewhat historical nowadays, but it ensures that OOo can
still run with a 'user-land' / 'green' threads implementation, and
not just with 'native' threads.

Unless we are willing to raise OOo's system requirements to 'native'
threads (and this may well limit the portability) I'm not going to
simply remove those 'sched_yield()' calls.

On the other hand, I agree that there may be too many such calls,
even after the above mentioned changes. Further changes to eliminate
them may well require some changes in 'vcl' application logic.

Should someone come up with evidence to show that this logic is wrong,
I'm willing to review that evidence (and patches) and help
incorporate them into the code.

Hope that helps,
Matthias

BTW: the proper target would be 'OOo 2.0' instead of 'OOo Later', at
least IMHO.
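
As an illustration of the trade-off described in this comment, the sketch below shows one way an event loop step could block in poll() with a timeout instead of calling 'sched_yield()' when there is nothing to do. It is not the code from 'vcl/unx/source/app/saldata.cxx'; `wait_for_input()` and `read_when_ready()` are hypothetical helpers, and the file descriptor stands in for whatever connection the loop is waiting on.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not taken from vcl: block until `fd` is readable
 * (or has reached end of file) or until `timeout_ms` elapses.
 * Returns 1 if a read would not block, 0 on timeout, -1 on error. */
int wait_for_input(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait. For simplicity the full
     * timeout is used again on each retry. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}

/* Wait up to `timeout_ms` for data on `fd`, then read it.
 * Returns the number of bytes read (0 means end of file),
 * -1 on error, or -2 if the wait timed out. */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    int ready = wait_for_input(fd, timeout_ms);

    if (ready == 0)
        return -2;
    if (ready < 0)
        return -1;
    return read(fd, buf, len);
}
```

Because poll() is a blocking call, a thread that waits this way already gives up the processor while idle, which is the situation in which the comment in 'sal/osl/unx/thread.c' does not ask for an extra 'sched_yield()'.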
Comment 6 matthias.huetsch 2003-04-07 20:07:11 UTC
...and the appropriate project would be 'gsl' / 'code'.

Thus changing project and target milestone...
Comment 7 mmeeks 2003-04-29 13:23:24 UTC
Grief; I just hit this, and it's amazingly bad; with a stock 1.0.3.1 -
I get > 300 sched_yields to do a simple re-display, taking ~100ms each;
that's tens of seconds on RH 9.0

I've added a:

#ifndef LINUX
    sched_yield();
#endif

which sucks really; but ... Would people object to hard-coding such
platform specific knowledge of thread greenness into the mainline ?
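
For anyone wanting to reproduce numbers like the ones above, a rough measurement can be sketched as follows. This is an assumption-laden illustration, not the method used in this comment: `time_yields()` and `estimated_stall()` are hypothetical helpers, and the result depends entirely on the kernel and on what else is running.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Hypothetical helper: wall-clock seconds spent in `count` consecutive
 * sched_yield() calls. Run it while a CPU-bound job is active to see how
 * long the scheduler keeps a yielding task off the processor. */
double time_yields(int count)
{
    struct timespec start, end;
    int i;

    assert(count >= 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < count; i++)
        sched_yield();
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Back-of-the-envelope stall estimate: number of yields per operation
 * multiplied by the observed cost of one yield, in seconds. */
double estimated_stall(int yields, double seconds_per_yield)
{
    return (double)yields * seconds_per_yield;
}
```

With the figures reported here, `estimated_stall(300, 0.1)` gives about 30 seconds for a single re-display, which matches the "tens of seconds" observation.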
Comment 8 matthias.huetsch 2003-05-06 20:29:56 UTC
Adding comments from duplicate issue 14104:

<14104>

Open Office has always had this defect: when the processor is loaded,
it runs slowly. But with the new OS that I have installed (SuSE Linux
8.2), and with OpenOffice versions from 1.0.2 to 1.1 Beta, this
becomes unacceptable. When a program occupies 85-97% of the CPU (no
matter what % of memory), OpenOffice is extremely slow and,
generally, unusable.

I wonder whether there is a solution. The problem seems to become
more and more serious as the OS-s and gcc versions progress.
 
Other similar applications, like koffice and Netscape, do not display
noticeable slowdown.

------- Additional Comments From Matthias Huetsch 2003-05-06 12:24 PDT -------

After upgrading my SuSE 8.1 to SuSE 8.2, I can only confirm this
issue. And yes, it is indeed bad with even repaints delayed for more
than a minute.

Thanks for reporting this, even if it is a duplicate of issue 9277.
I will add this description to issue 9277, and take care that it will
be fixed for OOo 1.1 at least.

</14104>
Comment 9 matthias.huetsch 2003-05-06 20:33:32 UTC
Having now seen the effect myself (due to issue 14104) I think this
cannot wait for OOo 2.0, but needs to be addressed for OOo 1.1 at least,
if not for OOo 1.0.4.

Retargeting to OOo 1.1 RC...
Comment 10 matthias.huetsch 2003-05-06 20:36:30 UTC
*** Issue 14104 has been marked as a duplicate of this issue. ***
Comment 11 christof.pintaske 2003-05-21 13:42:00 UTC
fixed in vcl10, review pending
Comment 12 matthias.huetsch 2003-05-28 13:43:09 UTC
Hi Christof,

Reviewed changes in vcl/unx/source/app/saldata.cxx, r1.23.28.1
-> offending call to osl_yieldThread() removed -> okay.

Tested cws_srx644_vcl10 build (unxlngi5.pro) on SuSE 8.2
-> application now responsive, even under 99% (user) CPU load -> okay.

As we evaluated some time ago, 'dtrans/source/X11/X11_selection.cxx'
uses osl_yieldThread() in two methods:
* one doubtful call in 'SelectionManager::getPasteData()'
* two redundant calls in 'SelectionManager::dragDoDispatch()'

The first should at least be evaluated, but the two redundant calls should
really be fixed. Otherwise I wouldn't regard this issue as fixed.

Matthias
Comment 13 matthias.huetsch 2003-05-28 20:09:12 UTC
dtrans/source/X11/X11_selection.cxx, r1.61.4.1 now also removes
the last calls to osl_yieldThread() -> reviewed okay -> verified.
Comment 14 christof.pintaske 2003-06-16 11:15:18 UTC
closing this one to get it out of the way for rc

cp->tom ribbens: you'll need to wait for the next OOo build to harvest
the fruits of this bug. Please feel free to reopen it if you think the
fix doesn't suit your needs.
Comment 15 sits 2003-07-24 19:01:02 UTC
So this is where all the action happened! :) I filed this as bug 7139
a while ago but I guess it slipped under the rug. Shall I mark it a
duplicate of this one?
Comment 16 wouter 2003-07-24 20:26:54 UTC
If you think it's the same bug, sure.
Comment 17 fa 2003-08-11 18:56:08 UTC
*** Issue 7139 has been marked as a duplicate of this issue. ***
Comment 18 jack.warchold 2003-09-08 12:00:08 UTC
*** Issue 18822 has been marked as a duplicate of this issue. ***