9277 – OOo reacts slow when system is under load with lower priority than OOo

Summary: OOo reacts slow when system is under load with lower priority than OOo

Status:	CLOSED FIXED

Alias:	None

Product:	gsl
Classification:	Code
Component:	code
Version:	OOo 1.0.2
Hardware:	PC Linux, all

Importance:	P2 Trivial
Target Milestone:	OOo 1.1 RC
Assignee:	christof.pintaske
QA Contact:	issues@gsl

URL:
Keywords:

Duplicates (6):	7139 9970 10212 11204 14104 18822
Depends on:
Blocks:

Reported:	2002-11-16 10:59 UTC by kosmo123
Modified:	2003-09-08 12:00 UTC
CC List:	4 users

See Also:
Issue Type:	DEFECT
Latest Confirmation in:	---
Developer Difficulty:	---

Description kosmo123 2002-11-16 10:59:22 UTC

Hi,

I run a distributed computing client on my Linux machine (www.distributed.net),
and when it is on, OOo is very slow. Probably the problem is that OOo looks
at the system, sees it is under 100% load, and doesn't try to get as much
CPU power. However, the distributed.net client runs at the lowest priority, so
any other program will get the CPU first if it needs it. All other programs
run fine alongside distributed.net.

Thanks in advance

Comment 1 chris 2003-03-26 11:33:45 UTC

10:38 < sethbc> haggai: i get a lot of bug reports that when a system
is under heavy load
10:38 < sethbc> haggai: openoffice runs like crap
10:38 < sethbc> haggai: today, i got a pretty interesting one
10:38 < sethbc> i'm talkin doesn't pick up keystrokes for a while
10:39 < sethbc> Gentoo is very compile based, so system load can be
high a lot of the time
10:39 < sethbc> i can replicate this, and i've seen a couple things in
IZ about it
10:39 <@haggai> oh,right.  Yes, I've seen that sort of stuff and
always assumed the machine was simply busy swapping
10:41 < sethbc> haggai: thats not it i don't think, one of my reports
mentions a schedule call
10:43 < sethbc> sched_yield
10:45 < sethbc> let me find some IZ numbers, but basically, the gist
is that sched_yield is totally different now
10:46 < sethbc> and that under heavy load, calling sched_yield will
sleep a process almost indefinitely (because of the new O(1) scheduler)
10:47 < sethbc> O(1) is used on 2.5, and in some 2.4 implementations
(i think its mainstream, need to check)
10:49 < sethbc> so...and i know this is a HUGE change...most of the
sched_yield calls should be changed
10:50 < sethbc> http://www.codemonkey.org.uk/post-halloween-2.5.txt
10:50 < sethbc> ^^ shows the differences in the calls
10:50 < sethbc> basically, i run 2.5, and if i don't nice my compiles,
i don't write a document
10:52 < sethbc> haggai: the part where he discusses sched_yield, he
specifically mentioned openoffice .... and in this case, any press
 isn't good press =)
10:52 < sethbc> yeah, i know you can't just do a find and replace on
sched_yield, but...its a problem
11:18 < sethbc> haggai: it looks like sched_yield is defined to
pthread_yield

http://www.codemonkey.org.uk/post-halloween-2.5.txt contains this section:

Process scheduler improvements.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Another much talked about feature. Ingo Molnar reworked the process
  scheduler to use an O(1) algorithm.  In operation, you should notice
  no changes with low loads, and increased scalability with large numbers
  of processes, especially on large SMP systems.
- Robert Love wrote various utilities for changing behaviour of the
  scheduler (binding processes to CPUs etc). You can find these tools at
  http://tech9.net/rml/schedutils
- The behavior of sched_yield() changed a lot.  A task that uses
  this system call should now expect to sleep for possibly a very
  long time.  Tasks that do not really desire to give up the
  processor for a while should probably not make heavy use of this
  function.  Unfortunately, some GUI programs (like Open Office)
  do make excessive use of this call and under load their
  performance is poor.  It seems this new 2.5 behavior is optimal
  but some user-space applications may need fixing.
- The above applies to use of yield() in the kernel, too.
- 2.5 adds system calls for manipulating a task's processor
  affinity: sched_getaffinity() and sched_setaffinity()
- Regressions to mingo@redhat.com and rml@tech9.net

So it looks to me that we must find a fix for this on both the 1.0 and
1.1 branches - as this will affect more and more users as time goes on.
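
For illustration only, here is a minimal sketch of the two waiting styles the
note above contrasts: re-checking and calling sched_yield() between checks,
versus sleeping inside poll() until the descriptor is ready. This is not OOo
code; the function names (wait_readable_yielding, wait_readable_blocking,
demo_pipe_wait) are made up for this example, and the pipe only stands in for
whatever descriptor an application would really wait on.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sched.h>
#include <unistd.h>

/* Hypothetical helper: check without blocking, then yield and retry.
 * Returns 1 if poll() reports an event on fd, 0 if max_spins ran out. */
int wait_readable_yielding(int fd, int max_spins)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    assert(max_spins >= 0);
    for (int i = 0; i < max_spins; i++) {
        if (poll(&pfd, 1, 0) > 0)
            return 1;
        sched_yield();  /* under load this may not return for a long time */
    }
    return 0;
}

/* Hypothetical helper: let the kernel wake us when fd is ready.
 * Returns 1 on an event, 0 on timeout, -1 on error. */
int wait_readable_blocking(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);  /* retry if a signal interrupted us */
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Usage sketch on a pipe: an empty pipe times out, one byte makes it ready.
 * Returns 0 if every step behaved as expected, -1 otherwise. */
int demo_pipe_wait(void)
{
    int fds[2];
    int ok;
    if (pipe(fds) != 0)
        return -1;
    ok = wait_readable_blocking(fds[0], 10) == 0
      && wait_readable_yielding(fds[0], 3) == 0
      && write(fds[1], "x", 1) == 1
      && wait_readable_blocking(fds[0], 10) == 1
      && wait_readable_yielding(fds[0], 3) == 1;
    close(fds[0]);
    close(fds[1]);
    return ok ? 0 : -1;
}
```

Both helpers give the same answers on an idle machine. The difference only
shows under load, where each sched_yield() in the first variant can put the
caller behind every other runnable task.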

Comment 2 chris 2003-03-26 11:33:49 UTC

*** Issue 10212 has been marked as a duplicate of this issue. ***

Comment 3 chris 2003-03-26 11:36:13 UTC

*** Issue 9970 has been marked as a duplicate of this issue. ***

Comment 4 thorsten.martens 2003-04-07 10:13:07 UTC

TM->MHU: Can you please have a look ? JA told me, that you might be
the one, who would investigate this problem. Thanks !

Comment 5 matthias.huetsch 2003-04-07 20:01:45 UTC

Hi Thorsten, Chris,

Yes, I have been investigating this (frequent yield() and poll()
or select() calls), and I did work with the 'gsl' team on improving
this (and will probably continue to do so).

The good news is that the current OOo 1.1 Beta build, as well as the
upcoming OOo 1.0.3 build, is already much improved in this respect.
Grep over the 'vcl' cvs archive for files committed with (Sun
internal) bug ids '#107292#' and '#104559#', in particular
'vcl/unx/source/app/saldata.cxx'.

The bad news is that there are still some 'sched_yield()' calls left for
situations where 'vcl' doesn't make any other interruptible call. The
reason can be seen from a comment in 'sal/osl/unx/thread.c', where
the function used, 'osl_yieldThread()', is implemented. That comment
basically states that POSIX requires a thread that does not otherwise
block (sleep, blocking I/O) to call 'sched_yield()'.

This might nowadays be somewhat historical, but it ensures that OOo can
still run with a 'user-land' / 'green' threads implementation, and
not just with 'native' threads.

Unless we are willing to raise OOo's system requirements to 'native'
threads (and this may well limit portability), I'm not going to
simply remove those 'sched_yield()' calls.

On the other hand, I agree that there may be too many such calls,
even after the above mentioned changes. Further changes to eliminate
them may well require some changes in 'vcl' application logic.

Should someone come up with evidence to show that this logic is wrong,
I'm willing to review that evidence (and patches) and help
incorporate them into the code.

Hope that helps,
Matthias

BTW: the proper target would be 'OOo 2.0' instead of 'OOo Later', at
least IMHO.

Comment 6 matthias.huetsch 2003-04-07 20:07:11 UTC

...and the appropriate project would be 'gsl' / 'code'.

Thus changing project and target milestone...

Comment 7 mmeeks 2003-04-29 13:23:24 UTC

Grief; I just hit this, and it's amazingly bad; with a stock 1.0.3.1
I get > 300 sched_yields to do a simple re-display, taking ~100ms each;
that's tens of seconds on RH 9.0

I've added a:

#ifndef LINUX
    sched_yield();
#endif

which sucks really; but ... Would people object to hard-coding such
platform specific knowledge of thread greenness into the mainline ?
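
To put numbers like these on another machine, a small timing helper is
enough. The sketch below is not part of OOo; time_yields_ms and
projected_stall_s are names invented for this example. It times n
back-to-back sched_yield() calls and works out the stall implied by the
figures above (300 calls at roughly 100 ms each).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Hypothetical helper: elapsed wall-clock milliseconds for n
 * back-to-back sched_yield() calls. */
double time_yields_ms(int n)
{
    struct timespec start, end;
    assert(n >= 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < n; i++)
        sched_yield();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (end.tv_sec - start.tv_sec) * 1000.0
         + (end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Hypothetical helper: projected stall in seconds for a given number
 * of calls and a per-call cost in milliseconds. */
double projected_stall_s(int calls, double ms_per_call)
{
    return calls * ms_per_call / 1000.0;
}
```

With the figures reported here, projected_stall_s(300, 100.0) comes to 30
seconds for one re-display. Running time_yields_ms(300) once on an idle
system and once next to an un-niced compile should show whether a given
kernel behaves this way.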

Comment 8 matthias.huetsch 2003-05-06 20:29:56 UTC

Adding comments from duplicate issue 14104:

<14104>

Open Office has always had this defect: when the processor is loaded,
it runs slow. But with the new OS that I have installed (SuSE Linux
8.2), and with versions of OpenOffice from 1.0.2 to 1.1 Beta, this
becomes unacceptable. When a program occupies 85-97% of the CPU (no
matter what  % of memory), the OpenOffice is extremely slow and,
generally, unusable. 

I wonder whether there is a solution. The problem seems to become
more and more serious as the OS-s and gcc versions progress.
 
Other similar applications, like koffice, Netscape do not display
noticeable slowdown.

------- Additional Comments From Matthias Huetsch 2003-05-06 12:24 PDT -------

After upgrading my SuSE 8.1 to SuSE 8.2, I can only confirm this
issue. And yes, it is indeed bad with even repaints delayed for more
than a minute.

Thanks for reporting this, even if it is a duplicate of issue 9277.
I will add this description to issue 9277, and take care that it will
be fixed for OOo 1.1 at least.

</14104>

Comment 9 matthias.huetsch 2003-05-06 20:33:32 UTC

Having now seen the effect myself (due to issue 14104) I think this
cannot wait for OOo 2.0, but needs to be addressed for OOo 1.1 at least,
if not for OOo 1.0.4.

Retargeting to OOo 1.1 RC...

Comment 10 matthias.huetsch 2003-05-06 20:36:30 UTC

*** Issue 14104 has been marked as a duplicate of this issue. ***

Comment 11 christof.pintaske 2003-05-21 13:42:00 UTC

fixed in vcl10, review pending

Comment 12 matthias.huetsch 2003-05-28 13:43:09 UTC

Hi Christof,

Reviewed changes in vcl/unx/source/app/saldata.cxx, r1.23.28.1
-> offending call to osl_yieldThread() removed -> okay.

Tested cws_srx644_vcl10 build (unxlngi5.pro) on SuSE 8.2
-> application now responsive, even under 99% (user) CPU load -> okay.

As we evaluated some time ago, 'dtrans/source/X11/X11_selection.cxx'
uses osl_yieldThread() in two methods:
* one doubtful call in 'SelectionManager::getPasteData()'
* two redundant calls in 'SelectionManager::dragDoDispatch()'

The first one should at least be evaluated, but the second one should
really be fixed. Otherwise I wouldn't regard this issue as fixed.

Matthias

Comment 13 matthias.huetsch 2003-05-28 20:09:12 UTC

dtrans/source/X11/X11_selection.cxx, r1.61.4.1 now also removes
the last calls to osl_yieldThread() -> reviewed okay -> verified.

Comment 14 christof.pintaske 2003-06-16 11:15:18 UTC

closing this one to get it out of the way for rc

cp->tom ribbens: you'll need to wait for the next OOo build to harvest
the fruits of this bug. Please feel free to reopen it if you think the
fix doesn't suit your needs.

Comment 15 sits 2003-07-24 19:01:02 UTC

So this is where all the action happened! :) I filed this as bug 7139
a while ago but I guess it slipped under the rug. Shall I mark it a
duplicate of this one?

Comment 16 wouter 2003-07-24 20:26:54 UTC

If you think it's the same bug, sure.

Comment 17 fa 2003-08-11 18:56:08 UTC

*** Issue 7139 has been marked as a duplicate of this issue. ***

Comment 18 jack.warchold 2003-09-08 12:00:08 UTC

*** Issue 18822 has been marked as a duplicate of this issue. ***