Problems with high loads and SPAM being sent - I think

#1 Fri, 12/07/2012 - 16:30
nothingless

So, I have a fairly new server with Virtualmin, and everything was running fine until a few days ago. The server load has gone insane, and there are always loads of Postfix processes running, sometimes hundreds. I've also received some bounced emails that were sent from addresses on my server that don't exist, which makes me think this may be spam related.

I ran uptime:

 23:10:02 up  7:00,  1 user,  load average: 183.23, 180.38, 166.24

mailq | tail -1
-- 512 Kbytes in 168 Requests.

top (a few minutes ago there was a lot more postfix stuff in it)

11357 root      20   0 15468 1664  896 R  0.3  0.0   0:00.24 top                                                                              
    1 root      20   0 19272 1476 1192 S  0.0  0.0   0:01.62 init                                                                             
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd                                                                         
    3 root      20   0     0    0    0 S  0.0  0.0   0:00.69 ksoftirqd/0                                                                      
    4 root      20   0     0    0    0 S  0.0  0.0   0:00.36 kworker/0:0                                                                      
    5 root      20   0     0    0    0 S  0.0  0.0   0:00.02 kworker/u:0                                                                      
    6 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0                                                                      
    7 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1                                                                      
    9 root      20   0     0    0    0 S  0.0  0.0   0:00.76 ksoftirqd/1                                                                      
   11 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/2                                                                      
   13 root      20   0     0    0    0 S  0.0  0.0   0:00.86 ksoftirqd/2                                                                      
   14 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/3                                                                      
   15 root      20   0     0    0    0 S  0.0  0.0   0:00.24 kworker/3:0                                                                      
   16 root      20   0     0    0    0 S  0.0  0.0   0:00.62 ksoftirqd/3                                                                      
   17 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/4                                                                      
   18 root      20   0     0    0    0 S  0.0  0.0   0:00.07 kworker/4:0                                                                      
   19 root      20   0     0    0    0 S  0.0  0.0   0:00.10 ksoftirqd/4                                                                      
   20 root      RT   0     0    0    0 S  0.0  0.0  95:48.99 migration/5                                                                      
   21 root      20   0     0    0    0 S  0.0  0.0   0:00.10 kworker/5:0                                                                      
   22 root      20   0     0    0    0 S  0.0  0.0   0:00.30 ksoftirqd/5                                                                      
   23 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/6                                                                      
   25 root      20   0     0    0    0 S  0.0  0.0   0:00.16 ksoftirqd/6                                                                      
   26 root      RT   0     0    0    0 S  0.0  0.0 183:33.31 migration/7                                                                      
   28 root      20   0     0    0    0 S  0.0  0.0   0:00.10 ksoftirqd/7                                                                      
   29 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 cpuset                                                                           
   30 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 khelper                                                                          
   31 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kdevtmpfs                                                                        
   32 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 netns                                                                            
  425 root      20   0     0    0    0 S  0.0  0.0   0:00.02 sync_supers                                                                      
  427 root      20   0     0    0    0 S  0.0  0.0   0:00.00 bdi-default                                                                      
  428 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kintegrityd                                                                      
  430 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kblockd                                                                          
  514 postfix   20   0 79484 4184 2864 D  0.0  0.0   0:00.00 cleanup                                                                          
  527 postfix   20   0 96392 5372 3804 S  0.0  0.0   0:00.00 smtpd     

Any idea on how I can further troubleshoot and fix this issue? Some help would be much appreciated! The server runs CentOS 6 with 32GB of RAM.

Fri, 12/07/2012 - 16:32
nothingless

System Information > Running Processes: 653

see here: http://pastie.org/5496096

Fri, 12/07/2012 - 16:33
nothingless

I tried turning on DKIM but that failed with: Failed to save DKIM settings : Failed to lock file /etc/postfix/main.cf after 5 minutes. Last error was :

Not sure it would have made a difference though!

Fri, 12/07/2012 - 18:00
andreychek

Howdy,

What output do you receive if you run the command "mailq | tail -1"?

I suspect that'll show a high number of email in your mail queue... if so, you'd need to figure out what is generating the spam.

It's likely either a legitimate user whose desktop got a virus, or someone who broke into a web app on your server and is using it to send spam.

You can figure all that out from the message headers of the email in your mail queue... you can view that in Webmin -> Servers -> Postfix -> Mail Queue.

-Eric

Fri, 12/07/2012 - 18:13
Locutus

@Eric: He did run mailq in his first post, and got 168 mails in the queue. :)

There's a great number of php-cgi processes running as user "disneysc", so that might be the one being flooded. I'd start by checking any sites that run as that user. A good tool for scanning for web-based malware is "Linux Malware Detect".

Also I'd stop Postfix right away until you figure this out, before your server gets put onto too many blacklists for spamming.

Fri, 12/07/2012 - 18:43
nothingless

Hello! Thanks for the replies, guys!

Yes, I ran the mailq command, it's showing this at the moment:

mailq | tail -1
-- 596 Kbytes in 225 Requests.

I think it's probably important to note that this is a personal server; it only has around 20 of my own personal websites on it. No other users have access. Most of the websites have one info@ forwarding address set up, and one domain uses a single POP3 account, and that's all. On average, I get maybe 10 emails daily from all these sites combined, so even though 225 maybe doesn't seem like a lot, it's far more than I estimate it should be!

The disneysc user is actually the largest site on the server - it runs on WordPress and gets over 1 million page views a month. I use caching, including APC and plugins for WP, but I thought the number of php-cgi processes was probably just due to the site being busy?

I'm trying to get into the Mail Queue to look at the headers but the load is so high it's not loading for me:

CPU load averages       246.73 (1 min) 245.45 (5 mins) 242.56 (15 mins)

Crazy, right??

Running processes shows:

CPU load averages:   247.10 (1 mins) , 246.31 (5 mins) , 243.48 (15 mins)
CPU type:   Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz , 8 cores
 
ID      Owner       CPU     Command   
20  root    90.0 %  [migration/5]
26  root    32.7 %  [migration/7]
3244    mysql   0.6 %   /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-e ...

and everything else at 0%, around 600 or so again.
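
To put those figures in perspective, the load average can be divided by the core count; anything far above 1 per core means tasks are queuing. This is a minimal sketch, assuming a Linux shell with awk available. The function name load_per_core is made up for illustration, and the numbers are the ones quoted above.

```shell
# Divide a load average by the number of cores.
# $1 = load average, $2 = number of cores
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Figures quoted above: 1-minute load 246.73 on 8 cores.
load_per_core 246.73 8

# On a live Linux box the current values could be fed in instead
# (assumes /proc/loadavg and nproc are available):
# load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

With the values above this works out to roughly 31 waiting tasks per core, while the process list shows almost no CPU in use.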

I've managed to stop Postfix via the Virtualmin interface, but still no luck with the mail queue. Anything else I can do, or some other way I can look at the queue? Virtualmin itself seems responsive enough, it's just the Webmin -> Servers -> Postfix page that won't load, so I can't click through to the queue. :/

Fri, 12/07/2012 - 22:24
andreychek

Howdy,

Sorry, missed your mailq output in your original post.

It's possible that Postfix not running is preventing Virtualmin from being able to display the Postfix screen... but you can accomplish all that from the command line too.

If it's a spam problem, the output of "mailq" will likely show the same user over and over.

If you can figure out which user is recurring (usually as the sender), take note of one of the message queue IDs associated with an email being sent from them.

Then, run:

postcat -q MESSAGE_QUEUE_ID

Scroll down to where the message headers show up -- and there you should be able to get some insight into how the email is being generated.
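
Spotting a recurring sender by eye gets tedious once the queue holds hundreds of entries, so the counting can be scripted. The following is a minimal sketch, assuming the classic mailq layout where each entry starts with a hexadecimal queue ID and the sender is the seventh field on that line. The helper name count_senders is hypothetical.

```shell
# Tally the envelope senders found in mailq-style output read from stdin.
# Only lines that begin with a queue ID (optionally followed by * or !)
# are considered; recipient and error lines are skipped.
count_senders() {
    awk '/^[0-9A-F]+[*!]?[ \t]/ && NF >= 7 { print $7 }' \
        | sort | uniq -c | sort -rn | head -n 20
}

# Typical use on the mail server:
# mailq | count_senders
```

The queue ID printed next to the most frequent sender is the one to pass to postcat -q.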

If you'd like a hand interpreting the email headers, feel free to post them here.

-Eric

Sat, 12/08/2012 - 04:58
nothingless

Hmm, strangely enough the load on my server has NOT gone down even though I stopped Postfix!

uptime
 11:06:26 up 18:56,  1 user,  load average: 249.00, 249.03, 249.05

Maybe I'm looking in the wrong area? I thought the load would go down with Postfix turned off. There are still a ton of Postfix processes listed under Running Processes - any idea how I can stop/kill all of those?
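
A load that stays this high with Postfix stopped and almost no CPU in use may come from processes stuck in uninterruptible sleep (state D, like the cleanup process in the earlier top output), because Linux counts those toward the load average. Here is a minimal sketch for tallying process states, assuming the procps version of ps. The helper name count_states is hypothetical.

```shell
# Count processes per state letter (R, S, D, Z, ...) from a list of
# ps "stat" values read on stdin, one per line.
count_states() {
    awk 'NF { s = substr($1, 1, 1); n[s]++ } END { for (s in n) print s, n[s] }' | sort
}

# Typical use:
# ps -eo stat= | count_states
#
# To see which commands are in state D:
# ps -eo stat=,comm= | awk '$1 ~ /^D/'
```

A large D count would point toward disk or filesystem waits rather than CPU work.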

I ran mailq:

postqueue: warning: Mail system is down -- accessing queue directly
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
EEC10149A*      834 Fri Dec  7 20:31:34  CDA4A3D1@veneilijat.net
                                             (bounce or trace service failure)
                                         BOUNCE@dedi-fr-57196.op-net.com
 
B5C1617B1*     1097 Fri Dec  7 22:31:06  double-bounce@dedi-fr-57196.op-net.com
                                         postmaster@dedi-fr-57196.op-net.com
 
F299617A1*      868 Fri Dec  7 22:18:29  double-bounce@dedi-fr-57196.op-net.com
                                         postmaster@dedi-fr-57196.op-net.com
 
BF8F717A8*     1109 Fri Dec  7 22:20:33  double-bounce@dedi-fr-57196.op-net.com
                                         postmaster@dedi-fr-57196.op-net.com
 
3F3DA17AE*      852 Fri Dec  7 22:26:28  688DFBD@johanvangilsfd.nl
                                         BOUNCE@dedi-fr-57196.op-net.com
 
447E817AA*     1093 Fri Dec  7 22:24:35  double-bounce@dedi-fr-57196.op-net.com
                                         postmaster@dedi-fr-57196.op-net.com
 
224A6148B*      866 Fri Dec  7 21:06:12  1629EE3@theknightsofavalon.com
                                         BOUNCE@dedi-fr-57196.op-net.com
 
29E4B15DB*     1094 Fri Dec  7 21:50:25  double-bounce@dedi-fr-57196.op-net.com
                                         postmaster@dedi-fr-57196.op-net.com
 
20342146E*     2946 Fri Dec  7 20:53:21  JaniahLopaz@pacific.net.sg
                                         BOUNCE@dedi-fr-57196.op-net.com
 
63B8117AD*     1102 Fri Dec  7 22:26:30  double-bounce@dedi-fr-57196.op-net.com
                                         postmaster@dedi-fr-57196.op-net.com
 
B6C16152B*     4011 Fri Dec  7 21:35:44  AlaynaThurstonson@dwvideoproductions.com
                                         BOUNCE@dedi-fr-57196.op-net.com
 
54BAC12AF*      882 Fri Dec  7 20:41:25  04203558A@poncedeleongroup.com
                                         BOUNCE@dedi-fr-57196.op-net.com
 
82DE71490*     1112 Fri Dec  7 21:07:21  alcaldesalteras@dipusevilla.es
                                         BOUNCE@dedi-fr-57196.op-net.com
 
EE9281532*     2697 Fri Dec  7 21:37:33  LaneyBarbaro@orange.fr
                                         BOUNCE@dedi-fr-57196.op-net.com
 
1FF78160C*     1100 Fri Dec  7 22:02:38  double-bounce@dedi-fr-57196.op-net.com
                                         postmaster@dedi-fr-57196.op-net.com
 
F296717BE*     1109 Fri Dec  7 22:40:33  double-bounce@dedi-fr-57196.op-net.com
                                         postmaster@dedi-fr-57196.op-net.com

There is not a single email address in there I recognize - the BOUNCE and postmaster addresses are standard ones, right? Why are there so many emails going from one to the other? Would that indicate the postmaster address is being used to send SPAM? dedi-fr-57196.op-net.com is my host name.

I looked up one of the bounce-postmaster emails:

*** ENVELOPE RECORDS active/B5C1617B1 ***
message_size:            1097             256               1               0            1097
message_arrival_time: Fri Dec  7 23:31:06 2012
create_time: Fri Dec  7 23:31:06 2012
named_attribute: log_message_origin=local
named_attribute: trace_flags=0
sender: double-bounce@dedi-fr-57196.op-net.com
original_recipient: postmaster
recipient: postmaster@dedi-fr-57196.op-net.com
*** MESSAGE CONTENTS active/B5C1617B1 ***
Received: by dedi-fr-57196.op-net.com (Postfix)
        id B5C1617B1; Fri,  7 Dec 2012 23:31:06 +0100 (CET)
Date: Fri,  7 Dec 2012 23:31:06 +0100 (CET)
From: MAILER-DAEMON@dedi-fr-57196.op-net.com (Mail Delivery System)
To: postmaster@dedi-fr-57196.op-net.com (Postmaster)
Subject: Postfix SMTP server: errors from unknown[95.86.0.88]
Message-Id: <20121207223106.B5C1617B1@dedi-fr-57196.op-net.com>
 
Transcript of session follows.
 
 Out: 220 dedi-fr-57196.op-net.com ESMTP Postfix
 In:  EHLO [95.86.0.88]
 Out: 250-dedi-fr-57196.op-net.com
 Out: 250-PIPELINING
 Out: 250-SIZE 10240000
 Out: 250-VRFY
 Out: 250-ETRN
 Out: 250-STARTTLS
 Out: 250-AUTH PLAIN LOGIN
 Out: 250-AUTH=PLAIN LOGIN
 Out: 250-ENHANCEDSTATUSCODES
 Out: 250-8BITMIME
 Out: 250 DSN
 In:  MAIL FROM:<D79F81FF@hollandwoningen.nl>
 Out: 250 2.1.0 Ok
 In:  RCPT TO:<celina@anabeatrizbarrosfan.com>
 Out: 250 2.1.5 Ok
 In:  DATA
 Out: 354 End data with <CR><LF>.<CR><LF>
 Out: 451 4.3.0 Error: queue file write error
 
Session aborted, reason: lost connection
 
For other details, see the local mail logfile
*** HEADER EXTRACTED active/B5C1617B1 ***
*** MESSAGE FILE END active/B5C1617B1 ***

At this point I decided to reboot my server, to clear the hundreds of postfix processes that weren't going away, and ran mailq again once it was rebooted:

postqueue: warning: Mail system is down -- accessing queue directly
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
97C3F16D0*     2865 Tue Dec  4 21:26:42  MAILER-DAEMON
                                         080926FE@sekelaam.endjunk.com
 
9362B15CE*     4172 Wed Dec  5 14:52:55  MAILER-DAEMON
                                         SusanBoshers@stadia.ch
 
381AC16D9      2861 Thu Dec  6 11:42:05  MAILER-DAEMON
(Host or domain name not found. Name service error for name=eurolatincz.com type=MX: Host not found, try again)
                                         14640506@eurolatincz.com
 
3A2261781      3711 Thu Dec  6 23:00:55  MAILER-DAEMON
     (connect to ftp3.scmedia.com.hk[203.184.143.49]:25: Connection timed out)
                                         rivierar@ftp3.scmedia.com.hk
 
303461584      2868 Wed Dec  5 00:54:07  MAILER-DAEMON
     (connect to cheapsnowmobile.com[98.129.126.138]:25: Connection timed out)
                                         CAE06F11@cheapsnowmobile.com
 
3C583176E      3620 Thu Dec  6 15:33:39  MAILER-DAEMON
(Host or domain name not found. Name service error for name=packbell.net type=MX: Host not found, try again)
                                         predicatekvvh2@packbell.net
 
335CD16D2      2840 Wed Dec  5 15:11:55  MAILER-DAEMON
(Host or domain name not found. Name service error for name=casteando.com type=MX: Host not found, try again)
                                         7F6F640@casteando.com
 
3AF241608      3654 Tue Dec  4 21:30:04  MAILER-DAEMON
       (connect to sneogg2.araneo.pl[195.178.114.15]:25: Connection timed out)
                                         bim@sneogg2.araneo.pl
 
330D817CA      3842 Thu Dec  6 23:40:51  MAILER-DAEMON
(connect to h-195-178-186-197.na.cust.bahnhof.se[195.178.186.197]:25: Connection refused)
                                         gillettetinytimg1@h-195-178-186-197.na.cust.bahnhof.se
 
B3864174E      3690 Thu Dec  6 20:59:00  MAILER-DAEMON
      (connect to www.reifengundlach.de[213.83.36.114]:25: Connection refused)
                                         jakunx@www.reifengundlach.de
 
B3F4516CC      3672 Thu Dec  6 19:05:22  MAILER-DAEMON
(host mx4.netstrefa.pl[217.149.243.156] said: 451 Temporary local problem - please try later (in reply to RCPT TO command))
                                         encirclingvoyq6@mx4.netstrefa.pl
 
B55BC1629      3737 Tue Dec  4 19:31:53  MAILER-DAEMON
(connect to da.e.78ae.static.theplanet.com[174.120.14.218]:25: Connection refused)
                                         fonseca@da.e.78ae.static.theplanet.com
 
B6748157F      2884 Wed Dec  5 00:53:58  MAILER-DAEMON
      (connect to owamail4.westerntc.edu[165.128.0.33]:25: Connection refused)
                                         32EBEFF4@owamail4.westerntc.edu
 
BA0A21595      6414 Tue Dec  4 16:05:29  MAILER-DAEMON
(Host or domain name not found. Name service error for name=gardenvillage.com type=MX: Host not found, try again)
                                         E8B1ABD@gardenvillage.com
 
B6FDE168D      3735 Tue Dec  4 23:49:13  MAILER-DAEMON
(connect to da.e.78ae.static.theplanet.com[174.120.14.218]:25: Connection refused)
                                         sure110@da.e.78ae.static.theplanet.com
 
02CA81614      2835 Thu Dec  6 15:21:36  MAILER-DAEMON
(Host or domain name not found. Name service error for name=llccorp.net type=MX: Host not found, try again)
                                         D4C4B190E@llccorp.net
 
084DE364       2831 Sat Dec  8 10:37:32  MAILER-DAEMON
      (connect to mail.veneilijat.net[81.19.114.200]:25: Connection timed out)
                                         CDA4A3D1@veneilijat.net
 
7EDC51610      2863 Thu Dec  6 17:01:03  MAILER-DAEMON
(Host or domain name not found. Name service error for name=anjcs.com type=MX: Host not found, try again)
                                         BF22B9E4@anjcs.com
 
D326F1646      2894 Wed Dec  5 00:50:24  MAILER-DAEMON
       (connect to yoolimgemstones.com[141.8.225.13]:25: Connection timed out)
                                         F3C8CF9@yoolimgemstones.com
 
D7CF1168A      2837 Wed Dec  5 01:01:18  MAILER-DAEMON
       (connect to mail.serviper.com[201.144.86.162]:25: Connection timed out)
                                         94789D6EE@serviper.com
 
DB9EB165E      2826 Wed Dec  5 00:58:45  MAILER-DAEMON
             (connect to annashouse.net[208.87.35.103]:25: Connection refused)
                                         8056D75DD@annashouse.net
 
2268D15B9      3625 Tue Dec  4 22:59:38  MAILER-DAEMON
          (connect to mx.silentpro.de[141.8.224.137]:25: Connection timed out)
                                         joness@mx.silentpro.de
 
2CC981785      3235 Fri Dec  7 06:02:20  MAILER-DAEMON
(host mx5.mail.yahoo.co.jp[183.79.57.236] refused to talk to me: 553 Mail from 37.59.52.173 not allowed - VS98-IP0 deferred - see http://help.yahoo.co.jp/help/jp/mail/anti-spam/anti-spam-24.html)
                                         watarulyoukoq7@yahoo.co.jp
 
2A89D1743      3663 Thu Dec  6 20:12:51  MAILER-DAEMON
       (connect to faye.localdns.com[119.110.108.55]:25: Connection timed out)
                                         eames@faye.localdns.com
 
2117315BD      2892 Tue Dec  4 23:00:46  MAILER-DAEMON
         (connect to la.consulting.com[82.98.86.161]:25: Connection timed out)
                                         D1E8775DC@la.consulting.com
 
22B7B1613      3724 Tue Dec  4 23:02:32  MAILER-DAEMON
(connect to weblinux3.gtdinternet.com[201.238.246.20]:25: Connection timed out)
                                         bimetallistic@weblinux3.gtdinternet.com
 
2EE3517B2      2905 Fri Dec  7 10:21:00  MAILER-DAEMON
     (connect to snyfkdvmuwpuln.dleh.com[98.124.198.1]:25: Connection refused)
                                         022BF97F0@snyfkdvmuwpuln.dleh.com
 
432931686      3706 Fri Dec  7 01:25:34  MAILER-DAEMON
(connect to gic-web-bsd-010.genotec.ch[82.195.224.110]:25: Connection timed out)
                                         ajdecatur@gic-web-bsd-010.genotec.ch
 
4A5421705      3700 Thu Dec  6 10:52:02  MAILER-DAEMON
      (connect to vz231.worldserver.net[80.81.243.131]:25: Connection refused)
                                         s1954@vz231.worldserver.net
 
42AC61672      3732 Tue Dec  4 21:32:23  MAILER-DAEMON
(host eagle135.startdedicated.com[69.64.34.106] said: 451 Temporary local problem - please try later (in reply to RCPT TO command))
                                         ables@eagle135.startdedicated.com
 
400201470      3866 Fri Dec  7 17:10:40  MAILER-DAEMON
             (connect to mail.centraltx.us[63.96.10.3]:25: Connection refused)
                                         SantiagoBoekhout@centraltx.us
 
4FACD1671      2804 Tue Dec  4 12:45:18  MAILER-DAEMON
(Host or domain name not found. Name service error for name=datwm.com type=MX: Host not found, try again)
                                         7650119@datwm.com
 
C75D21582      2854 Wed Dec  5 18:38:13  MAILER-DAEMON
(Host or domain name not found. Name service error for name=carbonbasket.com type=MX: Host not found, try again)
                                         003DE47AD@carbonbasket.com
 
C9CEC1492      2992 Fri Dec  7 19:38:29  MAILER-DAEMON
(connect to mx.kth.se[2001:6b0:1:1300:20e:7fff:fe26:4fe1]:25: No route to host)
                                         albynn@neutron.kth.se
 
CD8E41607      2847 Wed Dec  5 09:09:23  MAILER-DAEMON
   (connect to barleyandhopheads.com[98.129.229.195]:25: Connection timed out)
                                         C1172C51@barleyandhopheads.com
 
149ED16D5      2824 Fri Dec  7 05:20:49  MAILER-DAEMON
               (connect to livraria.pt[82.98.86.173]:25: Connection timed out)
                                         6F547D3@livraria.pt
 
11650114D      7720 Fri Dec  7 20:22:31  MAILER-DAEMON
(host mx1.emailsrvr.com[98.129.184.131] said: 450 4.7.1 <no_reply-PX@durham.com>: Relay access unavailable. (in reply to RCPT TO command))
                                         no_reply-PX@durham.com
 
1B9D5164B      2837 Tue Dec  4 12:45:16  MAILER-DAEMON
    (connect to mx1.zonadeforos.com.ar[190.228.30.222]:25: Connection refused)
                                         77F355A@zonadeforos.com.ar
 
1444B168E      3630 Tue Dec  4 20:16:14  MAILER-DAEMON
               (connect to canada.com[199.71.40.135]:25: Connection timed out)
                                         billzheng@canada.com
 
14D42158F      2873 Wed Dec  5 00:54:16  MAILER-DAEMON
(Host or domain name not found. Name service error for name=kavcolombia.com type=MX: Host not found, try again)
                                         670299A6A@kavcolombia.com
 
94F1616EF      3774 Wed Dec  5 16:13:18  MAILER-DAEMON
         (connect to mx1.securebank.com[10.42.23.11]:25: Connection timed out)
                                         message@securebank.com
 
94EA61746      2875 Wed Dec  5 18:46:26  MAILER-DAEMON
              (connect to cuxycoons.de[141.8.224.70]:25: Connection timed out)
                                         55F3330E4@cuxycoons.de
 
F31F2158D      3495 Wed Dec  5 12:26:14  MAILER-DAEMON
(host mx1.mail.yahoo.co.jp[183.79.29.234] refused to talk to me: 553 Mail from 37.59.52.173 not allowed - VS98-IP0 deferred - see http://help.yahoo.co.jp/help/jp/mail/anti-spam/anti-spam-24.html)
                                         gHightowerqday4Catherine@yahoo.co.jp
 
F091E1647      2876 Thu Dec  6 09:14:12  MAILER-DAEMON
(host mail.budgetawards.com[209.25.131.52] said: 451 qq write error or disk full (#4.3.0) (in reply to end of DATA command))
                                         C81E866@budgetawards.com
 
E80B716CB      3681 Thu Dec  6 15:58:43  MAILER-DAEMON
           (connect to cake.whatbox.ca[85.17.132.67]:25: Connection timed out)
                                         jvanderlinden@cake.whatbox.ca
 
E03371645      2810 Wed Dec  5 02:24:12  MAILER-DAEMON
(host mailx.hoster.ru[195.128.50.36] said: 451 Greylisting is in progress. Please, delay the message for at least 15 minutes before retry. (in reply to DATA command))
                                         7CC57E9@hotsys.ru
 
ED908168B      2843 Wed Dec  5 02:27:41  MAILER-DAEMON
(Host or domain name not found. Name service error for name=anjcs.com type=MX: Host not found, try again)
                                         F7A584D@anjcs.com
 
ED3BD1748      3667 Thu Dec  6 18:06:37  MAILER-DAEMON
     (connect to mx3.mail.yahoo.co.jp[183.79.57.237]:25: Connection timed out)
                                         0parentage5LenardrDooley@yahoo.co.jp
 
E31621754      3070 Thu Dec  6 21:55:56  MAILER-DAEMON
(host mail-fwd.mx.g19.rapidsite.net[199.239.254.18] said: 451 Could not load DRD for domain (sailmaker.com) rcpt (alberwickcolon@sailmaker.com) (in reply to RCPT TO command))
                                         alberwickcolon@sailmaker.com
 
E281A16A2      3614 Tue Dec  4 21:49:05  MAILER-DAEMON
           (connect to ns202330.ovh.net[91.121.145.21]:25: Connection refused)
                                         heeepp56@ns202330.ovh.net
 
ED1FC1787      2928 Thu Dec  6 21:01:48  MAILER-DAEMON
(Host or domain name not found. Name service error for name=gdaccountingservices.com type=MX: Host not found, try again)
                                         2B0B1FF@gdaccountingservices.com
 
EE2D61707      3783 Wed Dec  5 16:30:15  MAILER-DAEMON
         (connect to mx1.securebank.com[10.42.23.11]:25: Connection timed out)
                                         message@securebank.com
 
EE5221741      2899 Wed Dec  5 16:40:29  MAILER-DAEMON
          (connect to inforcentral.com[82.98.86.172]:25: Connection timed out)
                                         3175449B@inforcentral.com
 
8E751168C      3619 Tue Dec  4 20:10:46  MAILER-DAEMON
               (connect to ftp.cmp.cl[200.72.11.132]:25: Connection timed out)
                                         ly.flees@ftp.cmp.cl
 
8A9AF16DA      7580 Wed Dec  5 06:20:15  MAILER-DAEMON
              (connect to mx1.rural.com[10.42.23.11]:25: Connection timed out)
                                         aldohey@rural.com
 
8323E1644      2898 Tue Dec  4 19:35:38  MAILER-DAEMON
(Host or domain name not found. Name service error for name=hutchisonbuilders.co.nz type=MX: Host not found, try again)
                                         FD9C50B98@hutchisonbuilders.co.nz
 
8DB69170C      2885 Wed Dec  5 03:13:58  MAILER-DAEMON
(connect to ALT2.ASPMX.L.GOOGLE.COM[2a00:1450:4008:c01::1a]:25: No route to host)
                                         4CD4B21C@aethon.co.uk
 
87A1A174B      2846 Wed Dec  5 07:19:45  MAILER-DAEMON
               (connect to mail.protoncy.gr[195.46.5.82]:25: No route to host)
                                         EA5EC91@protoncy.gr
 
8291D1AE7      2845 Fri Dec  7 12:24:54  MAILER-DAEMON
               (connect to lucanux.com[141.8.224.25]:25: Connection timed out)
                                         D5D0AB3D@lucanux.com
 
8F26A15A0      2819 Wed Dec  5 00:54:16  MAILER-DAEMON
(host mail.premix.se[80.68.123.244] said: 450 Requested mail action not taken: mailbox unavailable (in reply to end of DATA command))
                                         6C403FB@premix.se
 
6CADC179C      2855 Fri Dec  7 02:30:58  MAILER-DAEMON
(Host or domain name not found. Name service error for name=gwconsumer.com type=MX: Host not found, try again)
                                         490DC7180@gwconsumer.com
 
622FB172C      3691 Thu Dec  6 19:51:49  MAILER-DAEMON
(host eagle135.startdedicated.com[69.64.34.106] said: 451 Temporary local problem - please try later (in reply to RCPT TO command))
                                         rd743@eagle135.startdedicated.com
 
631CB16D1      2916 Tue Dec  4 21:27:22  MAILER-DAEMON
(host mail-fwd.mx.g19.rapidsite.net[199.239.254.18] said: 451 Could not load DRD for domain (lakesideguitars.com) rcpt (5d693aa2@lakesideguitars.com) (in reply to RCPT TO command))
                                         5D693AA2@lakesideguitars.com
 
64ABD16D7      3618 Wed Dec  5 00:16:09  MAILER-DAEMON
(host condor.narzan.com[212.96.101.66] said: 450 4.1.1 <lxoda@condor.narzan.com>: Recipient address rejected: User unknown in local recipient table (in reply to RCPT TO command))
                                         lxoda@condor.narzan.com
 
6F82A127A      3314 Fri Dec  7 15:13:01  MAILER-DAEMON
         (connect to mx1.securebank.com[10.42.23.11]:25: Connection timed out)
                                         message@securebank.com
 
68092158E      3913 Tue Dec  4 21:16:23  MAILER-DAEMON
(host static-ip-188-138-96-241.inaddr.ip-pool.com[188.138.96.241] said: 451 4.7.1 Service unavailable - try again later (in reply to MAIL FROM command))
                                         chang3482@static-ip-188-138-96-241.inaddr.ip-pool.com
 
680D4174C      2848 Wed Dec  5 19:30:31  MAILER-DAEMON
(Host or domain name not found. Name service error for name=mastermakeover.com type=MX: Host not found, try again)
                                         C8FFFEB@mastermakeover.com
 
A0704159A      3825 Tue Dec  4 20:31:16  MAILER-DAEMON
(connect to static.16.105.9.5.clients.your-server.de[5.9.105.16]:25: Connection refused)
                                         foodbank@static.16.105.9.5.clients.your-server.de
 
AE73E162A      2862 Tue Dec  4 19:31:50  MAILER-DAEMON
       (connect to kamichijackson.com[208.91.197.27]:25: Connection timed out)
                                         85AA5E7@kamichijackson.com
 
AC0081702      2880 Thu Dec  6 16:15:31  MAILER-DAEMON
  (connect to pyramidpublication.com[208.91.197.101]:25: Connection timed out)
                                         7BFD4D6BB@pyramidpublication.com
 
A40B717CD      3666 Thu Dec  6 23:30:38  MAILER-DAEMON
         (connect to hosting.netrator.pl[195.110.48.2]:25: Connection refused)
                                         ukabctravelm@hosting.netrator.pl
 
592561640      3818 Tue Dec  4 19:55:52  MAILER-DAEMON
(connect to host-88-215-138-122.stv.ru[88.215.138.122]:25: Connection refused)
                                         billingsdd@host-88-215-138-122.stv.ru
 
57999179B      3608 Fri Dec  7 04:17:18  MAILER-DAEMON
(connect to reverse-89-106-12-63.turkticaret.net[89.106.14.231]:25: Connection refused)
                                         matthias.horn@reverse-89-106-12-63.turkticaret.net
 
544F217B3      2842 Fri Dec  7 10:23:52  MAILER-DAEMON
(host mailx.hoster.ru[195.128.50.36] said: 451 Greylisting is in progress. Please, delay the message for at least 15 minutes before retry. (in reply to DATA command))
                                         4D02F55@higea.ru
 
-- 263 Kbytes in 74 Requests.

Again, there is not a SINGLE address in there I recognize.
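
Every entry in that listing has MAILER-DAEMON as its sender, which suggests the queue holds bounce notifications rather than mail written on the server. To summarise why they are stuck, the deferral reasons can be tallied. This is a minimal sketch, assuming a grep that supports -o and -E. The helper name deferral_reasons is hypothetical and only matches the four reasons that appear most often above.

```shell
# Count the common deferral reasons in mailq-style output read from stdin.
deferral_reasons() {
    grep -oE 'Connection timed out|Connection refused|No route to host|Host not found' \
        | sort | uniq -c | sort -rn
}

# Typical use:
# mailq | deferral_reasons
```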

Running postcat on a randomly selected message:

*** ENVELOPE RECORDS deferred/A/AE73E162A ***
message_size:            2862             225               1               0            2862
message_arrival_time: Tue Dec  4 20:31:50 2012
create_time: Tue Dec  4 20:31:50 2012
named_attribute: log_message_origin=local
named_attribute: trace_flags=0
sender: 
original_recipient: 85AA5E7@kamichijackson.com
recipient: 85AA5E7@kamichijackson.com
*** MESSAGE CONTENTS deferred/A/AE73E162A ***
Received: by dedi-fr-57196.op-net.com (Postfix)
        id AE73E162A; Tue,  4 Dec 2012 20:31:50 +0100 (CET)
Date: Tue,  4 Dec 2012 20:31:50 +0100 (CET)
From: MAILER-DAEMON@dedi-fr-57196.op-net.com (Mail Delivery System)
Subject: Undelivered Mail Returned to Sender
To: 85AA5E7@kamichijackson.com
Auto-Submitted: auto-replied
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
        boundary="7182C1584.1354649510/dedi-fr-57196.op-net.com"
Content-Transfer-Encoding: 8bit
Message-Id: <20121204193150.AE73E162A@dedi-fr-57196.op-net.com>
 
This is a MIME-encapsulated message.
 
--7182C1584.1354649510/dedi-fr-57196.op-net.com
Content-Description: Notification
Content-Type: text/plain; charset=us-ascii
 
This is the mail system at host dedi-fr-57196.op-net.com.
 
I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.
 
For further assistance, please send mail to postmaster.
 
If you do so, please include this problem report. You can
delete your own text from the attached returned message.
 
                   The mail system
 
<BOUNCE@dedi-fr-57196.op-net.com> (expanded from <pamltup@nothing-less.net>):
    unknown user: "bounce"
 
--7182C1584.1354649510/dedi-fr-57196.op-net.com
Content-Description: Delivery report
Content-Type: message/delivery-status
 
Reporting-MTA: dns; dedi-fr-57196.op-net.com
X-Postfix-Queue-ID: 7182C1584
X-Postfix-Sender: rfc822; 85AA5E7@kamichijackson.com
Arrival-Date: Tue,  4 Dec 2012 20:31:50 +0100 (CET)
 
Final-Recipient: rfc822; BOUNCE@dedi-fr-57196.op-net.com
Original-Recipient: rfc822;pamltup@nothing-less.net
Action: failed
Status: 5.1.1
Diagnostic-Code: X-Postfix; unknown user: "bounce"
 
--7182C1584.1354649510/dedi-fr-57196.op-net.com
Content-Description: Undelivered Message
Content-Type: message/rfc822
Content-Transfer-Encoding: 8bit
 
Return-Path: <85AA5E7@kamichijackson.com>
Received: from [62.94.156.19] (unknown [62.94.156.19])
        by dedi-fr-57196.op-net.com (Postfix) with SMTP id 7182C1584
        for <pamltup@nothing-less.net>; Tue,  4 Dec 2012 20:31:50 +0100 (CET)
From: "Order Viagara" <85AA5E7@kamichijackson.com>
Subject: Ultra fast delivery
To: <pamltup@nothing-less.net>
List-Unsubscribe: <mailto:158578774DD5A10F@politikcity.de>
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; chars="iso-8859-1"
Date: Tue, 04 Dec 2012 20:31:16 +0200
Message-ID: <20121204203116.1C1B162EC572265661F4E.CA6A1@LEODARI-BBB2A69>
 
Yo, friend pamltup!
Ship worldwide
 
Keep your body in balance! Buy meds from us!
 
** Porpecia - 0.21$
** Levitr - 1.63$
++ Cialis - 1.73$
-- Viarga - 0.53$
 
Bad heathcare getting you sick and tired? Our pharmacy will give you decent help for less money!
 
http://ELZw.doctorrayo.ru/
 
--7182C1584.1354649510/dedi-fr-57196.op-net.com--
*** HEADER EXTRACTED deferred/A/AE73E162A ***
named_attribute: encoding=8bit
*** MESSAGE FILE END deferred/A/AE73E162A ***

Now this is interesting! It's most definitely SPAM, and nothing-less.net is one of the domains on my server. I've disabled it for now - does that mean that user was compromised, maybe hacked into or something?

Sat, 12/08/2012 - 08:33
Locutus

The example you posted looks more like a non-deliverable reply being sent out in reply to a spam that was sent TO your system, i.e. the user "pamltup@nothing-less.net".

The postqueue excerpt looks like all those mails it's trying to deliver are NDRs.

My impression from that info is that your server is under a massive incoming spam attack, to non-existent addresses, and that it is trying to return NDRs to all of them.

Suggestion would be to prevent Postfix from sending NDRs (need to look up though how to do that - haven't tried that before) for the time being, and then clearing the queue from everything that's coming from your MAILER-DAEMON.

But: This NDR thing should not be responsible for the MASSIVE system load of over 200 you're seeing. Postfix should surely be able to handle a few hundred mails in the queue without overloading the system that massively. Maybe a web script on your system is being abused in an attempt to send spam to local addresses. You might want to install "atop" and see which process uses the most CPU.

Also install that Linux Malware Detect I mentioned and have it scan your web directories. Shut down Apache if required while doing so, if the system load doesn't decrease.

Sat, 12/08/2012 - 12:21
nothingless

Thanks so much for your reply! I've turned off NDRs for now using the instructions found here:

http://www.linuxquestions.org/questions/linux-server-73/disable-ndr-on-p... - seemed to work OK, restarted Postfix after I applied the changes.

I have now also installed Linux Malware Detect and am running a scan on the nothing-less domain, it might take a while but hopefully that will throw something up. Any ideas for how else I might investigate the crazy loads? Anything else I could do to optimize the server?

Sat, 12/08/2012 - 12:39
nothingless

Hmm, it didn't find anything:

maldet --scan-all /home/nothing
Linux Malware Detect v1.4.1
            (C) 2002-2011, R-fx Networks <proj@r-fx.org>
            (C) 2011, Ryan MacDonald <ryan@r-fx.org>
inotifywait (C) 2007, Rohan McGovern <rohan@mcgovern.id.au>
This program may be freely redistributed under the terms of the GNU GPL v2
 
maldet(5569): {scan} signatures loaded: 10427 (8559 MD5 / 1868 HEX)
maldet(5569): {scan} building file list for /home/nothing, this might take awhile...
maldet(5569): {scan} file list completed, found 18710 files...
maldet(5569): {scan} found ClamAV clamscan binary, using as scanner engine...
maldet(5569): {scan} scan of /home/nothing (18710 files) in progress...
 
maldet(5569): {scan} scan completed on /home/nothing: files 18710, malware hits 0, cleaned hits 0
maldet(5569): {scan} scan report saved, to view run: maldet --report 120812-1909.5569

Tried it on the disneysc one as well and nothing found. I'll go through the rest of my 15+ sites just in case too though.

I tried installing atop and it seems to be working in spite of a little error that popped up when I first ran it. Here's what it output; does anything there catch your eye as being wrong or worth looking into?

http://pastie.org/5499331

Sat, 12/08/2012 - 13:20
Locutus

Okay, if LMD doesn't turn up anything, that's a good first step towards making sure your site isn't compromised. You might check the Apache access log to see whether some page is getting an unusually high number of hits.
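One way to look for that in an access log is to count requests per path. This is only a minimal sketch and assumes the log uses the usual common/combined format, where the request is the first double-quoted field; the function name is made up for illustration, and the location of the log file depends on how your virtual hosts are set up.

```python
from collections import Counter


def top_requests(log_lines, n=10):
    """Return the n most requested paths as (path, hits) pairs."""
    hits = Counter()
    for line in log_lines:
        # In common/combined format the request is the first quoted
        # field, e.g. "GET /index.php HTTP/1.1".
        parts = line.split('"')
        if len(parts) < 2:
            continue  # not a recognisable access log line
        request = parts[1].split()
        if len(request) >= 2:
            hits[request[1]] += 1
    return hits.most_common(n)
```

Calling it with an open log file, for example top_requests(open(path)) where path is whichever access log you want to inspect, gives the busiest paths first.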

As for atop, nothing unusual right now. You might wanna watch it if CPL goes up again, and see which processes have high CPU, or if there is high disk I/O or disk wait.

Sat, 12/08/2012 - 17:33
nothingless

Hmm. I think the load must have crept up while I was away from the computer, as my server crashed again. Anything I can check for now that I'm rebooting it? Which Apache log can I check?

Sat, 12/08/2012 - 19:02
nothingless

OK, the loads have crept up a little again so I checked atop again:

 01:51:17 up  1:27,  1 user,  load average: 14.08, 8.17, 5.82

Results here: http://pastie.org/5500478

Does that show anything strange?

Sun, 12/09/2012 - 04:08
Locutus

It's always a bit difficult to give ideas about a dynamic process like system load creeping up by looking at a snapshot of performance data. :) Three things from here:

The disk performance stats indicate that you're using software RAID, which looks quite idle in your snapshot. The physical disks though show high and constant activity. It's possible that your RAID is performing a check or rebuild or something, you can check this with cat /proc/mdstat. You might want to post its output here.

atop writes historical performance data to files named /var/log/atop*.log. You can look at these with atop -r filename. Then you can navigate through the 10-minute interval snapshots using t (forward) and Shift-T (backward).

I can offer to take a look at your system, if you trust me sufficiently to give me root login. If you'd like that, just post an instant messenger screen name here and I'll contact you there.

Sun, 12/09/2012 - 05:04
nothingless

I would be so happy if you could take a look at my system! I don't use IM, but my email address is fivebyfive@gmail.com - if you give me a shout on there I'll email it through. Thank you!

Sun, 12/09/2012 - 09:12
Locutus

Sent a mail, waiting for reply. :)

Sun, 12/09/2012 - 11:59
nothingless

Thanks, info sent! :)

Sun, 12/09/2012 - 12:21
Locutus

Okay, here's what I found out so far:

The RAID arrays seem to be okay, no rebuild in progress or anything. /dev/md1 and its physical disks are very busy, with very high write rates, caused by the php-cgi processes of users "adriana" and "erin".

You might want to check what kind of web pages those have running and why they produce so many disk writes. The Apache access logs suggest that those pages get very many hits.

Unfortunately, your version of atop does not write log files and is missing some process audit functions. Might have to do with the hoster you use.

You might want to install the tool "apachetop". I could not find it with Yum, and unfortunately I'm not too familiar with yum-based distros.

Sun, 12/09/2012 - 12:22
nothingless

Yeah, they're 2 very busy Wordpress sites. All my sites are WP, and I'm using W3 Total Cache with Page Disk Caching turned on. Could that cause this? Do you have any tips for how I might better optimize or even upgrade my server to cope with these kinds of write rates?

Sun, 12/09/2012 - 12:27
Locutus

Users "disneysc", "halloween", "fashions" and "nikki" now also exhibit very high write rates. You might wanna check their web pages too.

Sun, 12/09/2012 - 12:31
Locutus

Well, your box has 32 GB of memory, why not install a PHP cache on several levels, like an instruction cache (xcache), and "memcached" which Wordpress should support.

If you have very fast HDDs that can cope with those high write rates, e.g. RAID-0 arrays with multiple disks, or SSDs, it is okay to use extensive disk caching, but since you're experiencing problems with system load going through the roof, you should try memory caches instead of disk caches.

This does not solve the spam issue, but that's another thing. Finding out where those come from is separate from the high load issue. Maybe let's first try to fix the load issue.

Sun, 12/09/2012 - 12:36
Locutus

I can see CPU Wait values going up a lot, which implies that your processes are bogged down by waiting for probably Disk I/O. Another indication that your disk caches are excessive.

The Apache logs seem to indicate that the Wordpress sites in question host photos or other images? It's not really advisable to cache those on the disk. That's basically like copying them on every request. Caching is useful for compiled PHP code or dynamic pages that contain mostly text and don't change very often. Try disabling any disk caching for starters.
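The CPU wait figure can also be worked out without atop, from two readings of the first "cpu" line in /proc/stat taken a few seconds apart. This is a sketch under the assumption of the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); the function name is hypothetical.

```python
def iowait_percent(stat_before, stat_after):
    """Percentage of CPU time spent in iowait between two 'cpu' lines."""

    def fields(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
        return [int(value) for value in parts[1:]]

    before, after = fields(stat_before), fields(stat_after)
    deltas = [a - b for a, b in zip(after, before)]
    # Sum only the first eight counters; guest time is already
    # included in the user counter.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait
```

A value that stays high while the load average climbs points at processes queuing up behind the disks rather than at a shortage of CPU.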

Sun, 12/09/2012 - 12:36
nothingless

OK - so you think I should try turning off disk caches? I've installed APC a few weeks ago, does that work with memcache or xcache too, do you know?

Sun, 12/09/2012 - 12:39
Locutus

Please reload; I edited my previous post at the same time you sent yours.

Also: Do those Wordpress sites have any kind of contact form or other way of sending you emails? That might be what is (ab)used for sending the spam to you.

You might try disabling any contact forms while you test that theory.

Sun, 12/09/2012 - 12:39
nothingless

OK, I've turned off page disk caching for 'adriana' and changed it to APC instead - where would I go to see if that's making a difference?

Sun, 12/09/2012 - 12:41
Locutus

Also I'm not sure if Wordpress is the right software to host a bunch of high-traffic photo sites, performance-wise. There's probably better suited software for that, though I don't know off the bat what you could try.

About disk cache: You should turn it off for ALL your WP sites, and then take a look at atop to see if the disk activity dies down and CPL goes down too.

Sun, 12/09/2012 - 12:43
nothingless

Ah ok sorry, I've reloaded now. Yes, they are all image galleries - 95% of the content is images. OK, I'll try disabling page disk caching altogether, is that better than switching it to APC? And I'll take down the forms too just to see if that helps. Fingers crossed!

Sun, 12/09/2012 - 12:44
Locutus

Also, take a look at the "cache" value in the MEM row. That indicates how much of your physical memory is used for file system caches. As you can see, it's at about 2.3 GB already, which means that probably most of the photos that users request are in the memory cache.

So it's possible you don't even need xcache or memcached (considering you have an 8-core CPU which is bored out of its mind for the most part), but using those 32 GB RAM as memory cache (which Linux does automatically) might suffice.

Sun, 12/09/2012 - 12:47
Locutus

Yes, you don't need ANY page disk cache if you host mostly galleries. As you can see, it's on the contrary rather counter-productive. Your system spends most of its time writing to the cache, and the processes have to wait for the disk to write that cache.

Reading from the disk cache won't bring an advantage anyway, since that 8-core CPU and the 32 GB memory, when properly used as file system cache, deliver the gallery pages much quicker than reading the stuff from your disk cache.

Mon, 12/10/2012 - 05:29
nothingless

Well, I have to say I think you've hit the nail on the head!! I switched off all my disk caching last night, and when I checked my server again this morning, it was 1) still online!! and 2) the loads were all around 1.5 which is SO much more reasonable than 200! :-)

Thank you so much for your help - I can do a little further tweaking now, and I might try installing xcache or memcache also just to speed things up, but it really seems like the disk writing speed was the bottleneck, and without that, the server seems happy enough. Wasn't a SPAM issue after all then! Thanks again for all your help!!

Mon, 12/10/2012 - 09:28
nothingless

OK - so here's a quick update! I installed and configured Memcache, alongside APC which was already installed. I made sure all of my sites are using W3 Total Cache as a Wordpress plugin, with both page caching and minify set to APC, and object & database caching set to Memcache, so nothing is writing caches to disk. The sites that had WP Super Cache installed have been switched to W3 Total Cache, with Super Cache deleted completely. This seems to have helped a lot: my loads are a lot lower and the server is less prone to crashing.

I've been checking atop, and although things look better, the rows for DSK are still in the red often enough, with busy listed around 80-90% most of the time, only sometimes falling to 60% or less, and sometimes spiking to 100% (when that happens, all my sites immediately start throwing 500 Internal Server Errors). Where should I look next to see which users are writing to disk so much, or how could I troubleshoot this next?

Mon, 12/10/2012 - 10:12
Locutus

The list below the generic statistics shows which process is doing what. When you press g to switch to "generic data", then shift+d to order the list by disk activity, you should be able to see who's the culprit.

Mon, 12/10/2012 - 10:21
nothingless

Hmm, it seems to be the 'apache' process owned by root that does a lot of writing? Could you take a look and confirm I'm reading it right?

Mon, 12/10/2012 - 11:12
Locutus

I've been watching atop for a few minutes now, but disk write rates are well below 1 MB/s, with an occasional "spike" of 2 MB/s caused by MySQL. Nothing anywhere near the red zone.

Just now a process named "bw.pl", possibly triggered by cron, had a quick CPU spike and read a few hundred MBs.

Nothing out of the ordinary so far.

Mon, 12/10/2012 - 11:25
nothingless

Ok thank you! I actually made a little tweak to my APC settings after my last post, changing the apc.mmap_file_mask to /dev/zero as that was supposed to lower disk activity, and it really has, everything looks pretty good at the moment. Thanks again for all your help - I'll keep monitoring it over the next few days to see how I go!

Mon, 12/10/2012 - 11:29
Locutus

Rogerroger, and you're welcome! :)

(Oh, and don't mind the little Trojan I left somewhere on your server. ;) )

Mon, 12/10/2012 - 11:30
nothingless

LOL! Thanks, I always enjoy little surprises like that. ;)

Tue, 12/11/2012 - 15:30
nothingless

Back with another quick question!! My server's going so far so good:

uptime
 22:15:47 up 1 day,  6:09,  2 users,  load average: 0.14, 0.29, 0.35

I've been monitoring things via atop, which all seems good, the disk writes are all around 20-30%. However, because I'm now actually using more of my RAM, I've seen that creep up and up over the past 2 days. At the moment, the MEM row looks like this:

MEM |  tot    31.3G  | free  521.1M  |  cache  22.6G  |  dirty   7.9M |  buff    3.0G  |  slab    1.5G  |               |                |

The value for free has gone from 25+ GB to 500MB over the past 48 hours. Obviously I want to use all of the RAM I have, but is this something to worry about? Do I need to make some changes to make sure the cache doesn't fill up completely, or, will Linux/APC/Memcache all be smart enough to automatically clear some of the cache when it needs more space?

Tue, 12/11/2012 - 18:14
Locutus

Nothing to worry about there. RAM used as cache IS basically free and can be reassigned to applications as soon as they require it. Linux follows the philosophy "free ram is wasted ram" and uses as much of it as cache as it can get. :)
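To put a number on that, the sketch below adds up MemFree, Buffers and Cached from the text of /proc/meminfo. Treat the result as a rough upper bound: the function name is made up for illustration, and part of "Cached" (shared memory, tmpfs) cannot actually be dropped on demand.

```python
def memory_summary(meminfo_text):
    """Return free, cache and 'effectively free' memory in kB."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        # Lines look like "MemFree:   533606 kB"; skip anything else.
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    free = values.get("MemFree", 0)
    cache = values.get("Buffers", 0) + values.get("Cached", 0)
    return {"free": free, "cache": cache, "effectively_free": free + cache}
```

With figures like the ones in the MEM row above, "effectively_free" comes out at well over 20 GB even though "free" is only about half a gigabyte.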
