Apache OpenOffice (AOO) Bugzilla – Issue 24914
MX servers are not responding
Last modified: 2004-08-31 17:54:27 UTC
Tested website.openoffice.org, qa.openoffice.org, openoffice.org, marketing.openoffice.org, and user-faq.openoffice.org. None of the MX servers for these hosts are responding. I suggest replacing or updating the MTA software.
I see for openoffice.org

openoffice.org. 5M IN MX 20 openoffice.org.
openoffice.org. 5M IN MX 5 asmx1.sfo.collab.net.

and no MX record for the subprojects. asmx1.sfo.collab.net is responsive for me. The SMTP server at openoffice.org is not. My impression is that messages addressed to xxx@openoffice.org are delivered after some time. Messages addressed to xxx@<project>.openoffice.org can't be delivered as the SMTP server at openoffice.org (64.125.133.202) is not responding.
adding to top5
I've filed an internal issue for the Ops and engineering team to review. I will update this issue shortly.
Ops has replied:

The subprojects (aka subdomains of "openoffice.org") automatically get routed to the MX for the domain. Mail is being delivered properly using MX records, this is *exactly* the way internet mail and DNS is supposed to work. This is a non-issue. (We closed off openoffice.org from receiving email directly from the internet to force all email to go through the filtering MX, since we're getting such a high volume of worm/virus traffic.)

closing/invalid
Thanks for the update. It is sad that things still do not work. I have mail queued for the past 48 hours that cannot be transmitted because there is NO MTA to pick it up. So much for your claim.
E.g. porting:

pavel@pavel:~> dig -t mx porting.openoffice.org

; <<>> DiG 9.2.2 <<>> -t mx porting.openoffice.org
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 213
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;porting.openoffice.org. IN MX

;; AUTHORITY SECTION:
openoffice.org. 284 IN SOA ns1.collab.net. hostmaster.collab.net. 2004010900 3600 1800 2419200 300

;; Query time: 12 msec
;; SERVER: 10.20.0.1#53(10.20.0.1)
;; WHEN: Thu Jan 29 21:05:32 2004
;; MSG SIZE rcvd: 101

It returns nothing, thus mails are delivered directly to the IP, and 64.125.133.202 does not listen on port 25 :-( The MX records are invalid, IMHO. Please show the relevant part of the OOo zone.
-bash-2.05b$ host -t mx openoffice.org
openoffice.org mail is handled by 5 asmx1.sfo.collab.net.
openoffice.org mail is handled by 20 openoffice.org.

Why does openoffice.org not listen on port 25? This is the reason for

E9185AA7C7 635 Thu Jan 29 21:14:29 Pavel@Janik.cz
(connect to council.openoffice.org[64.125.133.202]: Connection timed out)
doesnotexist@council.openoffice.org

and similar.
You might want to take a wildcard MX record for *.openoffice.org into consideration. IIRC we had something like that earlier.
Kenneths:

> The subprojects (aka subdomains of "openoffice.org") automatically get routed to
> the MX for the domain.

What do you mean by automatically? Could you please name the mechanism that is used? Do you have a domain-wide MX? *.openoffice.org? As I stated in previous comments, this is not the case. There is only an MX for openoffice.org itself. The rest is routed *directly* to the IP address!

> Mail is being delivered properly using MX records, this
> is *exactly* the way internet mail and DNS is supposed to work. This is a
> non-issue.

No. As you can see, mails to project.openoffice.org are not delivered to the MX with the lowest priority. Why? Because there is no MX, thus all mails are delivered to the A record of openoffice.org:

pavel@pavel:~> host -t A openoffice.org
openoffice.org has address 64.125.133.202

which is not listening (it seems to firewall port 25) -> no mails at all! When I manually (via telnet ... 25) sent mail via the primary MX (asmx1.sfo.collab.net.) to announce@cs.openoffice.org, it came back to me almost instantly (I'm the moderator of it)!
To be exact, all mails are delivered to project.openoffice.org which is A to the same IP as openoffice.org, ie. 64.125.133.202.
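The routing behaviour described in the comments above can be sketched in a few lines of Python. This is only an illustration, assuming the records quoted in this issue; the in-memory tables and the function name `delivery_targets` are hypothetical and no real DNS lookup is performed. It shows why a subproject with no MX record of its own ends up at its A record instead of at asmx1.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory snapshot of the records quoted in this issue.
MX_RECORDS: Dict[str, List[Tuple[int, str]]] = {
    "openoffice.org": [(20, "openoffice.org"), (5, "asmx1.sfo.collab.net")],
}
A_RECORDS: Dict[str, str] = {
    "openoffice.org": "64.125.133.202",
    "porting.openoffice.org": "64.125.133.202",
}


def delivery_targets(domain: str) -> List[str]:
    """Return the hosts a sending MTA would try for `domain`, in order."""
    mx = MX_RECORDS.get(domain, [])
    if mx:
        # MX records exist: the lowest preference value is tried first.
        return [host for _pref, host in sorted(mx)]
    if domain in A_RECORDS:
        # No MX at all: the sender falls back to the domain's own address.
        return [domain]
    return []


if __name__ == "__main__":
    print(delivery_targets("openoffice.org"))
    print(delivery_targets("porting.openoffice.org"))
```

With these tables, mail for openoffice.org is offered to asmx1 first, while mail for porting.openoffice.org goes straight to the host at 64.125.133.202.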
One of ten:

**********************************************
** THIS IS A WARNING MESSAGE ONLY **
** YOU DO NOT NEED TO RESEND YOUR MESSAGE **
**********************************************

The original message was received at Thu, 29 Jan 2004 12:53:59 -0500
from www.pathtech.org [205.189.41.25]

----- Transcript of session follows -----
<dev@marketing.openoffice.org>... Deferred: Connection timed out with marketing.openoffice.org.
<authors@user-faq.openoffice.org>... Deferred: Connection timed out with user-faq.openoffice.org.
<dev@website.openoffice.org>... Deferred: Connection timed out with website.openoffice.org.
Warning: message still undelivered after 4 hours
Will keep trying until message is 5 days old
as is obvious, project list mail is not being delivered. I have updated the PCN issue to reflect that. louis
update: I have entered pavel's point in the PCN issue and further updated the issue. This is going on the 3rd day of no project mail. louis
this probably more closely corresponds to PCN 25658 adding dthomas to cc list and kerry, too. louis
*** Issue 24945 has been marked as a duplicate of this issue. ***
We continue to work on this. This morning we added a wildcard MX record pointing to asmx1.sfo.collab.net. We continue to monitor the situation closely.
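For readers unfamiliar with wildcard records, here is a simplified Python sketch of how a wildcard MX is matched. It assumes the standard DNS rule that a wildcard only answers for names that do not otherwise exist in the zone. The zone contents, the name `explicit.openoffice.org`, and the `lookup` helper are hypothetical; whether the subproject names in the real zone are explicit entries or wildcard-synthesized is not stated in this issue.

```python
from typing import Dict, List

Zone = Dict[str, Dict[str, List[str]]]

# Hypothetical zone snapshot; only the wildcard MX target is taken from this issue.
ZONE: Zone = {
    "openoffice.org": {
        "A": ["64.125.133.202"],
        "MX": ["5 asmx1.sfo.collab.net.", "20 openoffice.org."],
    },
    "*.openoffice.org": {"MX": ["5 asmx1.sfo.collab.net."]},
    # Hypothetical name that exists in the zone with an A record but no MX.
    "explicit.openoffice.org": {"A": ["64.125.133.202"]},
}


def lookup(zone: Zone, name: str, rtype: str) -> List[str]:
    """Simplified wildcard-aware lookup (ignores CNAMEs and empty non-terminals)."""
    if name in zone:
        # The name exists, so the wildcard never applies, even if this
        # particular record type is missing.
        return list(zone[name].get(rtype, []))
    labels = name.split(".")
    for i in range(1, len(labels)):
        ancestor = ".".join(labels[i:])
        if ancestor in zone:
            # Closest existing ancestor found: only its wildcard child can match.
            return list(zone.get("*." + ancestor, {}).get(rtype, []))
    return []


if __name__ == "__main__":
    print(lookup(ZONE, "porting.openoffice.org", "MX"))
    print(lookup(ZONE, "explicit.openoffice.org", "MX"))
```

In this sketch a name with its own A record but no MX gets an empty MX answer despite the wildcard, so a sender would still fall back to the A record for that name.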
Mail is working its way through OOo but there is a lot of it. As the engineer on duty reported, for Saturday:

"Lots of mail is getting through. Between the hours of 7am and 8am today, for example, the inbound mail exchanger for the Sun sites processed 41390 incoming messages. Of these:
31214 (75%) were Novarg worms
5264 (13%) were spam
4912 (12%) were delivered on to one of the sun sites."

Saturday AM (-0800) is not a busy time. The resulting load is probably producing some erratic behavior. Please report any such behavior in this issue. louis
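As a quick sanity check of the figures quoted above, the following Python snippet recomputes the shares from the raw counts. The counts are taken from the engineer's report; the dictionary keys and the `shares` helper are just illustrative names.

```python
from typing import Dict

# Counts from the engineer's report for the 7am-8am window.
TOTAL = 41390
COUNTS: Dict[str, int] = {
    "novarg_worm": 31214,
    "spam": 5264,
    "delivered": 4912,
}


def shares(counts: Dict[str, int], total: int) -> Dict[str, int]:
    """Return each category as a whole-number percentage of the total."""
    return {name: round(100 * n / total) for name, n in counts.items()}


if __name__ == "__main__":
    # The three categories account for every message in the window.
    print(sum(COUNTS.values()) == TOTAL)
    print(shares(COUNTS, TOTAL))
```

The categories add up to the reported total, and roughly seven out of eight messages in that hour were worm or spam traffic.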
Created attachment 12840: DNS analysis for OOo
The statistics are interesting but there is more to the problem. DNS for OOo has errors that are preventing messages from being received. Please see the attachment.
updating the PCN issue accordingly; thanks Ger. Louis
You are certainly aware that asmx1 has been unavailable for the past 4 hours...
Could you please elaborate a bit more on your statement "asmx1 is not available"?
$ telnet asmx1.sfo.collab.net smtp
Trying 64.125.133.81...
telnet: connect to address 64.125.133.81: Connection refused
$
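The same check can be scripted. The Python sketch below is an assumption-laden equivalent of the telnet test: it only attempts a TCP connection and reports whether the port accepted it, refused it, or timed out. The function name `probe_tcp` is hypothetical, and it does not speak SMTP at all.

```python
import socket


def probe_tcp(host: str, port: int = 25, timeout: float = 5.0) -> str:
    """Try a plain TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered but nothing accepted the connection
        # (closed port, or a listener rejecting new connections).
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout, e.g. a firewall dropping packets.
        return "timeout"
    except OSError as exc:
        # DNS failure, unreachable network, and similar errors.
        return f"error: {exc}"


# Example (needs network access):
#   print(probe_tcp("asmx1.sfo.collab.net"))
```

The distinction matters in this issue: "refused" matches what was seen on asmx1, while "timeout" matches the behaviour reported for 64.125.133.202.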
per our internal issue Ops has restated that the mail gateway is available and is processing: <snip>It's up and working fine, in fact it's processing mail at a rapid rate. Unfortunately for periods of time it gets too busy and rejects new connections. So, MTA's will just need to keep retrying (which they will do!) and eventually the messages will be delivered. <snip>
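To illustrate what "keep retrying" means for a sender, here is a small Python sketch of a queue lifetime. The 4-hour warning and 5-day limit come from the warning message quoted earlier in this issue; the 30-minute retry interval is purely an assumption for illustration, as are the names used. Real MTAs use their own configurable schedules.

```python
from datetime import timedelta

WARN_AFTER = timedelta(hours=4)         # from the quoted warning message
GIVE_UP_AFTER = timedelta(days=5)       # from the quoted warning message
RETRY_INTERVAL = timedelta(minutes=30)  # assumption, not from this issue


def queue_status(age: timedelta) -> str:
    """Classify a queued message by how long it has been waiting."""
    if age >= GIVE_UP_AFTER:
        return "bounced"
    if age >= WARN_AFTER:
        return "deferred, sender warned"
    return "deferred"


def max_attempts(interval: timedelta = RETRY_INTERVAL,
                 give_up: timedelta = GIVE_UP_AFTER) -> int:
    """Number of fixed-interval retries that fit before the message expires."""
    return give_up // interval


if __name__ == "__main__":
    print(queue_status(timedelta(hours=48)))
    print(max_attempts())
```

Under these assumptions a message queued for 48 hours is still being retried, which is consistent with both the reports of queued mail and Ops' statement that it will eventually be delivered.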
Ger, I presented your point to Ops at CollabNet. They appreciated the insight and data and systematically went through all the points. Mail is, as it happens, working fine; it's just that we are way overloaded, on the order of 3 million mail messages last week, 2.2 million of which were clearly spam/viral. The errors you point to are not actually relevant. Thus:

[quoting:]
ERROR: no SOA record for openoffice.org. from tld1.ultradns.net.
ERROR: no SOA record for openoffice.org. from tld2.ultradns.net.

These "errors" are meaningless. The toplevel server doesn't need an SOA record for openoffice.org. All the toplevel server needs is the NS records for the openoffice domain. Our nameservers, ns{1,2,3}.collab.net, have SOA records in place. A missing SOA record on the TLD server could not have any possible effect on email delivery. For example, kernel.org also "fails" this test.

The last error message looks valid, but is in fact wrong:

ERROR: NS list from openoffice.org. authoritative servers does not match NS list from parent (org.) servers

If this were indeed true, it would be a misconfiguration, but it is untrue. Doing the following queries easily proves this:

dig @TLD1.ULTRADNS.NET openoffice.org ns
dig @TLD2.ULTRADNS.NET openoffice.org ns
dig @ns1.collab.net. openoffice.org ns
dig @ns2.collab.net. openoffice.org ns
dig @ns3.collab.net. openoffice.org ns

End quote.

So, the fact of the matter is that things are working, but because of the tsunami of wormy crap coming from out there, the system is just slow in delivering mail. Thanks for collaborating! louis
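The comparison Ops describes (NS list from the parent servers versus NS list from the authoritative servers) comes down to a set comparison once the dig answers are collected. A minimal Python sketch follows, assuming the answers have already been gathered by hand; the sample lists are hypothetical inputs built from the ns{1,2,3}.collab.net names mentioned above, not captured output.

```python
from typing import Iterable, Set


def normalize(names: Iterable[str]) -> Set[str]:
    """Lower-case host names and drop the trailing root dot."""
    return {name.rstrip(".").lower() for name in names}


def ns_sets_match(parent: Iterable[str], child: Iterable[str]) -> bool:
    """True when parent and authoritative servers list the same nameservers."""
    return normalize(parent) == normalize(child)


if __name__ == "__main__":
    # Hypothetical answers, as one might copy them from the dig output.
    from_parent = ["NS1.COLLAB.NET.", "ns2.collab.net.", "ns3.collab.net."]
    from_child = ["ns3.collab.net", "ns1.collab.net", "ns2.collab.net"]
    print(ns_sets_match(from_parent, from_child))
```

Normalizing case and the trailing dot matters because a naive string comparison of the two answers can report a mismatch where there is none.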
To give insight into how heavily the system is being taxed:

Quote: FYI, on the outgoing side, here's stats through 10:30AM from openoffice. As you can see, plenty of mail going out!

Total delivery attempts: 277806
Accepted by destination: 176531 (63.5%)
Deferred by destination: 67223 (24.2%)
Failures: 23404 (8.4%)
Double bounces: 10646 (3.8%)
Triple bounces: 2 (0.0%)

Meanwhile, on the incoming side, we've been hit by several million/week, most of which (>95%) are spam/viruses. Expect delays. Mail however is going through and the system is correctly configured. louis
Outgoing is NOT my concern. Inbound is. I have posted messages to user-faq starting Saturday that have yet to arrive and be posted. This concerns me a lot. That is why I started to look at DNS and the MX records and how they were handled. When I saw errors I added them to the issue. I received your personal email with the update of this issue and replied. I'll bet that because outgoing is working well you will see this first.
hi actually, I received both about the same time. Ger, please check your own ISP. The blockage could be there, esp. for mail sent Saturday. Mail is getting through, though, as I emphasized before, not as smoothly or swiftly as desired. louis
For the record, I act as my own ISP. I have my own portable Class "C" that my upline routes for me. Other than paying once a month, that's the only involvement of an outside service. My MTA connects directly and thus I know that the OOo mail server accepted the messages, but that's as far as I could go with troubleshooting my own message submissions. They still haven't shown up in user-faq. Maybe time to try again ;-)
*** Issue 25375 has been marked as a duplicate of this issue. ***
Now that 25375 is closed as a duplicate of this, what's the status of having the problems fixed so mailing lists actually work? And why didn't the hardware upgrade give any noticeable performance improvements?
The servers are still under a heavy load from the very high worm-related traffic. I'll update this issue with some statistics shortly.
But load statistics will only tell me why things are broken, not how they are going to get better 8-(
Since unwanted traffic seems to be a major problem, is there any solid technical reason that http://qmail-scanner.sourceforge.net/ cannot be implemented?
Maybe. But one would need to know where the load is coming from
if there is a high load due to worm activity, how come the announce@moderation queue - the only part of mail system that appears to work as expected - only receives 20 mails per hour?
I tried to send an e-mail yesterday and today to discussie@nl.openoffice.org, but it's not coming through (also not in the archives). Are e-mails getting lost, or are they just held up in some queue?
> I tried to send an e-mail yesterday and today to discussie@nl.openoffice.org,

The second mail arrived, more than an hour late. I suppose the first one has been lost...
simonbr: what time did you send the mail, what was the subject line and what host was the message sent from? I'll relay that information to the Ops Engineers to investigate.
@kenneth, both mails: Subject: "Open Office en OpenOffice.org", sent to discussie@nl.openoffice.org via smtp.xs4all.nl, from: simon.oo.o@xs4all.nl
First mail (lost) sent at 10 feb 2004, 22:42 +0100
Second mail (received) sent at 11 feb 2004, 19:52 +0100
More evidence:

To: OOo_marketing list <dev@marketing.openoffice.org>
Subject: Re: [Marketing] Re: [website-dev] Accessibility of Openoffice
Date: Wed, 11 Feb 2004 13:08:32 -0500

To: OOo_marketing list <dev@marketing.openoffice.org>
Cc: Louis Suarez-Potts <louis@openoffice.org>, Jacqueline McNally <openoffice.org@decisions-and-designs.com.au>
Subject: [Fwd: Accessibility of Openoffice]
Date: Tue, 10 Feb 2004 08:59:43 -0500

To: OOo_marketing list <dev@marketing.openoffice.org>
Cc: OOo_website mailing list <dev@website.openoffice.org>
Subject: Accessibility of Openoffice
Date: Tue, 10 Feb 2004 00:37:24 -0500
As a way of conducting project communications, the mailing lists simply don't work right now. I have 10 mails now which have received 'cannot deliver for 4 hours' notices today and have not made it to the lists.
Some of the subjects / times:

subject: accessibility of openoffice.org, time: 13:54 +0000
subject: Re: [discuss] Triple O Reader, time: 14:39 +0000
subject: ping, time: 15:39 +0000
subject: insert random topic here, time: 15:51 +0000
subject: 1.1.1 and marketing, time: 16:14 +0000
I needed the host information as well please.
The host would be one of nwkea-mail-{1,2,3,4}.sun.com or brmea-mail-{1,2,3,4}.sun.com. For the listed mails, in that order:

brmea-mail-3.sun.com
nwkea-mail-1.sun.com
brmea-mail-3.sun.com
brmea-mail-4.sun.com
brmea-mail-4.sun.com

In general, Sun management and IT ops tend to frown upon publishing names of internal machines.
The status right now is:

openoffice.org. 300 IN MX 5 asmx1.sfo.collab.net.
openoffice.org. 300 IN MX 20 openoffice.org.

Two MX records. One of them (the preferred one) is asmx1, the other is openoffice.org, which is the same host as www.openoffice.org. I propose another solution:

- remove openoffice.org from the MX records
- add more machines as MX with the same priority to get more rotation
- those machines, together with the current asmx1, will only do prefiltering for viruses, bounces and such; the rest will be relayed to openoffice.org, which will accept incoming SMTP only from those systems.

This scales much better: you can add more and more machines and can overcome SC's badly designed system that has to (according to available information) run on only one single machine. Really?
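The "same priority to get more rotation" idea relies on senders picking among equal-preference MX hosts in random order. The Python sketch below shows that selection rule under stated assumptions: the hosts `asmx2.example`, `asmx3.example`, and `backup.example` are hypothetical placeholders, and `mx_try_order` is an illustrative helper, not any MTA's actual code.

```python
import random
from itertools import groupby
from typing import List, Tuple


def mx_try_order(records: List[Tuple[int, str]], rng: random.Random) -> List[str]:
    """Order MX hosts by preference, shuffling hosts that share a preference."""
    ordered: List[str] = []
    # Sort by preference so groupby sees each preference value as one run.
    for _pref, group in groupby(sorted(records), key=lambda rec: rec[0]):
        hosts = [host for _p, host in group]
        rng.shuffle(hosts)  # equal-preference hosts share the load
        ordered.extend(hosts)
    return ordered


if __name__ == "__main__":
    # asmx1 is from this issue; the other hosts are hypothetical placeholders.
    records = [
        (5, "asmx1.sfo.collab.net"),
        (5, "asmx2.example"),
        (5, "asmx3.example"),
        (20, "backup.example"),
    ]
    print(mx_try_order(records, random.Random()))
```

Each sender picks its own random order within a preference level, so inbound connections spread across all filtering hosts while the higher-numbered record is only tried after the preferred ones.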
As stated elsewhere, the extraordinary load of email is being filtered and is making it through, albeit slowly. Any bounced messages can be resent and will sit in the queue. I'm going to close out this issue as it has morphed from its original problem, which is being dealt with, followed and described in detail in another issue. Any suggestions on replacing hardware or the MTA should be sent to st & mh, who will forward their recommendations to our Sun contacts. Other failed message delivery notices can go in the previously filed issues as well.
This issue is not about replacing anything. That was a suggestion, as the existing software was failing badly. Closing this as invalid is silly. If you want to close it, close it as resolved if, indeed, you feel it is resolved.
closing as resolved
Good. And thanks
Closing this issue.