Issue 26595 - PrinterInfoManager/SystemQueueInfo race condition
Summary: PrinterInfoManager/SystemQueueInfo race condition
Status: CLOSED FIXED
Alias: None
Product: gsl
Classification: Code
Component: code (show other issues)
Version: OOo 1.1.1RC
Hardware: Mac Mac OS X, all
: P2 Trivial (vote)
Target Milestone: OOo 1.1.2
Assignee: fa
QA Contact: issues@gsl
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-03-17 04:48 UTC by fa
Modified: 2006-04-04 20:59 UTC (History)
2 users (show)

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
patch to make race condition much less likely (1.27 KB, patch)
2004-03-17 04:48 UTC, fa
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this issue.
Description fa 2004-03-17 04:48:05 UTC
In debugging the Mac OS X printer issues, there appears to be a race condition in 
printerinfomanager.cxx.

PrinterInfoManager object:
1) c'tor creates the SystemQueueInfo object/thread
       a) SystemQueueInfo object calls Thread::create(), which then creates the thread and schedules run()
2) c'tor calls initialize()
3) initialize() does lots of stuff
4) initialize() calls getSystemPrintQueues() and retrieves what hopefully are the results of 
SystemQueueInfo::run()

Somewhere between 1 and 4, run() should execute.  However, there is no way to prevent 
PrinterInfoManager from knowing whether run() has finished or not.  Theoretically, PrinterInfoManager 
functions should block on run() completing successfully before these functions attempt to access its 
data.  I am observing behavior where SystemQueueInfo::run() does not complete its run by the time 
step (4) comes around.  Therefore, there are no print queues returned to PrinterInfoManager object and 
it does not report any queues to OOo the first run through.  I am also observing behavior where when 
the /etc/cups/printers.conf watch file is changed, that the PrinterInfoManager object is too quickly 
calling getSystemPrintQueues() in initialize() and hasChanged still != 0.

The remedy seems to be this patch, although it still doesn't account for the lag time between when 
create() is called and when run() actually is called, which may or may not cause the race condition to 
still occur.  An ideal situation would be one where the PrinterInfoManager has a callback of sorts that 
updates its printer list, rather than having the PrinterInfoManager keep checking SystemQueueInfo.  ie, 
SQI notifies the PIM it has complete data.  This would solve the possible problem of the system queue 
info querying (trying all print commands for system queues) taking a longer time than the PIM can cope 
with as well.
Comment 1 fa 2004-03-17 04:48:46 UTC
Created attachment 13863 [details]
patch to make race condition much less likely
Comment 2 fa 2004-03-17 04:51:36 UTC
cp/pl: are there any problems with moving the Mutex up in the code?  Before the patch, the Mutex was 
quite localized and only protected the actual data that is changed.  With the patch, the whole routine is 
of course inside the Mutex.  The desired effect here is to ensure that ->hasChanged() does not return 
until any pending run() is complete...
Comment 3 fa 2004-03-17 04:53:18 UTC
This bug pretty much blocks OS X release of 1.1.1 since no system print queues show up otherwise.
Comment 4 philipp.lohmann 2004-03-17 09:04:24 UTC
I think we shouldn't apply this as the whole point of the thread is that
hasChanged may return befor it has finished. The problem being solved here is
that the thread may never finish due to a hanging lpc command; if that would
always finish the thread wouldn't be needed at all.
Comment 5 philipp.lohmann 2004-03-17 10:40:37 UTC
How do no system queues show up ? Every time a frame (e.g. dialogue, document)
gets the focus, PrinterInfoManager::checkPrintersChanged should get called and
if printers are available by then, then the application is told so. Is probably
something amiss so that checkPrintersChanged does not really get called ?
Comment 6 christof.pintaske 2004-03-17 10:47:13 UTC
cp->fa: please check as Philipp suggested, the thread is not required to
terminate early but essentially only befor you actually try to print. Is the
thread still running when you try to do so ? I wouldn't mind if you change the
code for Mac only but "hanging print queues" _is_ an issue so that I cannot
accept to move the mutex in general.
Comment 7 pavel 2004-03-17 18:37:25 UTC
according to IRC agreement, retarget to 1.1.2 to get proper fix and not hold
1.1.1 release for main platforms. Final patch will be applied manually to MacOS
X build.
Comment 8 khendricks 2004-03-30 19:21:04 UTC
Hi Philipp,

I am trying to figure this out as well.  I never see the problem that Dan describes and I am worried that 
if we apply tha patch, some users may see a long hang or startup delay.

I noticed you said that the race is not improtant since every focus in on a new frame updates the 
information.

When I looked at that code I see the following:

            if( GetSalData()->pFirstInstance_->maInstData.mbPrinterInit )
                vcl_sal::PrinterUpdate::update();

So mbPrinterInit must be true for this to happen.  Since that value is initialized to false, it must be first 
set to true.

I see this happening in:

CreateInfoPrinter
CreatePrinter
GetPrinterQueueInfo
GetPrinterQueueState
GetDefaultPrinter.

My question is this, on a clean start after a fresh install (with no use of spadmin) which of these 
routines gets called first.  Could the race be that when the File->Printer Settings focus is done, that 
mbPrinterInit is not true yet (no jobs have printed yet, no timers have expired yet).

Since mbPrinterInit is never changed after that, perhaps we should simply be enabling update() 
constantly for any new frame focus in and checking the mbPrinterInit value only at the last possible 
moment someplace later in the code.

Kevin

  
Comment 9 philipp.lohmann 2004-03-31 08:39:27 UTC
CreateInfoPrinter and CreatePrinter are called whenever an application creates a
Printer object; writer does this per default on startup, some others do also but
e.g. impress does not AFAIK. When an application wants to list the existing
printers (e.g. for the printer settings dialog) GetPrinterQueueInfo gets called;
if you startup writer it is extremely unlikely that you could fire up the
printer dialog before the document gets the focus - unless focus events are
somehow not propagated.

An fprintf in vcl_sal::PrinterUpdate::update() should clarify things a bit i
think; also a breakpoint might show whether it gets called and by whom - if gdb
can be trusted to actually stop which i must sadly say is not always the case at
least on x86.
Comment 10 Martin Hollmichel 2004-04-30 14:18:14 UTC
set platform.
Comment 11 khendricks 2004-05-06 19:01:09 UTC
Hi,

Adding myself to CC on this one since this patch holds up the Panther printing patch which depdends 
on it.

Perhaps we can simply ifdef MACOSX here so that other paltforms are not affected since empirically the 
patch that Dan created is needed for proper working of MacOSX.

Ideas?

Kevin

Comment 12 khendricks 2004-05-09 16:58:08 UTC
Hi Dan,

I ifdef'd your MutextGuard change for just MacOSX so that no other platforms would be affected and 
then added a FIXME comment to see why this is not needed for other platforms.

I integrated your race fix into the Panther printing patches which were tested and found not to impact 
Jaguar users.

So resolving this as fixed but we should remember to revisit this to figure out why this was needed for 
just us.

Thanks,

Kevin
Comment 13 Martin Hollmichel 2006-04-04 20:59:15 UTC
close issue