Random Email Problem (Virtualmin GPL)

#1 Thu, 05/12/2011 - 10:17
luisgrodriguez

Hi, I have set up a VPS with Virtualmin GPL to administer my domains. I have 5 domains, but only 2 of them use mail services. Everything works, but for monitoring I recently installed a script that sends a daily summary of mail activity. My question is about the random errors it reports.

This is the output.

The first part of the mail gives an idea of the daily traffic:

Grand Totals
------------
messages

    292   received
   1099   delivered
    313   forwarded
      3   deferred  (3 deferrals)
     43   bounced
      7   rejected (0%)
      0   reject warnings
      0   held
      0   discarded (0%)

And this is the detail of the bounces and failures:

message deferral detail
-----------------------
  local (total: 1)
         1   local: fatal: execl /bin/sh: Too many open files in system
  qmgr (total: 1)
         1   unknown mail transport error
  smtp (total: 1)
         1   //postgrey.schweikert.ch/help/quirsa.com.html (in reply to RCP...

message bounce detail (by relay)

  local (total: 42)
        30   "/usr/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME"
         3   mail forwarding loop for dcoxaj.sysvirtuales@admin.sysvirtuales...
         2   libtermcap.so.2: cannot open shared object file: Error 23
         2   mail forwarding loop for info.sysvirtuales@admin.sysvirtuales.com
         1   libc.so.6: cannot open shared object file: Error 23
         1   procmail: Unknown user "ricardo.int-rosenberg"
         1   procmail: Unknown user "ventas.int-rosenberg"
         1   Program failure (127) of "/etc/webmin/virtual-server/lookup-dom...
         1   Program failure (-11) of "/etc/webmin/virtual-server/lookup-dom...
  none (total: 1)
         1   User unknown in virtual alias table

I am just beginning with Postfix deployment, so I would appreciate any help with these errors.

Thanks.

Thu, 05/12/2011 - 11:48
andreychek

Howdy,

Well, it's not unusual to see some email errors... a lot of what you're seeing there may simply be due to spammers guessing at incorrect usernames.

A few of those you may want to check out... what you'd want to do is look in the mail logs, and determine the context of when they're occurring.

For example, "Too many open files in system" and "libtermcap.so.2: cannot open shared object file" are a little unusual. My suggestion there would be to open /var/log/mail.log or /var/log/maillog, locate where those errors are, and determine what exactly was happening at that time that could have contributed to those errors.
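As a rough sketch of that log search: the function below prints every line containing a given string, together with a few lines before and after it, so the surrounding activity is visible. The name show_context and the default of 3 context lines are illustrative choices, not something from this thread; the log file is whichever of the two paths above exists on the server.

```shell
# show_context PATTERN FILE [N]
# Print each line of FILE that contains PATTERN (fixed string, case-insensitive),
# with line numbers and N lines of context before and after (default 3).
show_context() {
    grep -i -F -n -C "${3:-3}" -- "$1" "$2"
}

# Example invocation (the path is an assumption, adjust to your system):
#   show_context "Too many open files" /var/log/maillog
```

Matching lines are printed as "NUMBER:text" and context lines as "NUMBER-text", which makes it easy to see which services were active just before each failure.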

-Eric

Sun, 05/22/2011 - 17:33
luisgrodriguez

Hi, this is one of the errors shown in maillog:

May 20 17:07:03 admin dovecot: execv(/usr/libexec/dovecot/pop3-login) failed: Too many open files in system
May 20 17:08:04 admin dovecot: execv(/usr/libexec/dovecot/pop3-login) failed: Too many open files in system

I think the problem is some configuration on the SMTP server or something similar. The server is a VPS with 2 GB of RAM, and it handles approximately 900 to 1000 emails per day. Another issue is that I have approximately 30 forwarding rules for users on the server (incoming and outgoing rules). Some days I get more than 100 errors like this:

May 19 15:55:04 admin postfix/local[11544]: D606010068020: to=ricardo.int-rosenberg@admin.sysvirtuales.com, orig_to=analucia@int-rosenberg.net, relay=local, delay=0.98, delays=0.94/0/0/0.05, dsn=5.3.0, status=bounced (Command died with status 127: "/usr/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME")
May 19 15:55:04 admin postfix/local[4022]: D7FFE10068022: to=ventas.int-rosenberg@admin.sysvirtuales.com, orig_to=aracely@int-rosenberg.net, relay=local, delay=0.98, delays=0.93/0/0/0.05, dsn=5.3.0, status=bounced (Command died with status 245: "/usr/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME")

This problem happens every day with random accounts; sometimes delivery works and sometimes it does not.

I need some help to solve this.

Any help with tuning my server is appreciated.
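To see how often each kind of failure occurs, the bounce lines can be tallied by exit status. This is only a sketch: the function name bounce_status_counts is made up for illustration, and it counts only bounces logged in the "Command died with status N" form shown above. By shell convention, status 127 usually means a command could not be found or started.

```shell
# bounce_status_counts: read Postfix log lines on stdin and print how many
# "Command died with status N" bounces were seen for each status N.
bounce_status_counts() {
    # Keep only the numeric status from matching lines, then count duplicates.
    sed -n 's/.*status=bounced (Command died with status \([0-9][0-9]*\).*/\1/p' |
        sort -n | uniq -c
}

# Example invocation (the path is an assumption, adjust to your system):
#   bounce_status_counts < /var/log/maillog
```

Each output line is a count followed by the status, so a sudden jump in one status on a given day stands out.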

Sun, 05/22/2011 - 17:59
andreychek

Howdy,

You may be seeing a lot of traffic in Dovecot, enough to make it hit the open file limit for your server.

You can always try bumping up your open file limit in /etc/security/limits.conf.
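Before changing anything, it can help to list which open-file entries are already present. Lines in limits.conf have the form "domain type item value", and the sketch below prints the domain, type, and value of every nofile entry while skipping comments. The function name nofile_entries is illustrative. Note also that limits.conf is applied by the pam_limits module when a session is opened through PAM, so a daemon started from an init script may not be affected by it.

```shell
# nofile_entries: read limits.conf-style text on stdin and print
# "domain type value" for each non-comment line whose item is "nofile".
nofile_entries() {
    awk '!/^[[:space:]]*#/ && NF >= 4 && $3 == "nofile" { print $1, $2, $4 }'
}

# Example invocation:
#   nofile_entries < /etc/security/limits.conf
```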

As far as your email delivery issue goes -- what do you see if you type this command:

ls -l /usr/bin/procmail-wrapper

Sun, 05/22/2011 - 18:07
luisgrodriguez

Thanks.

In limits.conf I have set this:

* soft nofile 4096
* hard nofile 4096

And the output of the command you gave me is:

-rwsr-sr-x 1 root root 4994 May 9 2007 /usr/bin/procmail-wrapper

Sun, 05/22/2011 - 18:22
andreychek

Hmm, I have a suspicion that either those settings aren't being seen, or you're getting a crazy amount of traffic :-)

One thing you might try is to put the following in your limits file, then restart Dovecot:

dovecot hard nproc 8192
@dovecot hard nproc 8192

Mon, 05/23/2011 - 00:34
luisgrodriguez

I will try this change and check the performance tomorrow. How can I see the real traffic to my server? Is there a tool to monitor this?

Mon, 05/23/2011 - 08:25
andreychek

Well, it wouldn't likely be a high amount of bandwidth. However, you can see how many Dovecot processes are running at any given time with this command:

ps auxw | grep dovecot | wc -l
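One caveat with that pipeline is that it usually counts the grep process itself as well. A slightly more careful sketch is below: it reads bare command names on stdin and counts those matching a pattern. The function name count_procs and the example pattern are assumptions for illustration; which process names your Dovecot version uses should be checked against the actual ps output.

```shell
# count_procs REGEX: count lines on stdin whose first field (the command
# name) matches the extended regular expression REGEX.
count_procs() {
    awk -v pat="$1" '$1 ~ pat { n++ } END { print n + 0 }'
}

# Example invocation, counting the master, login, and mail processes:
#   ps -eo comm= | count_procs '^(dovecot|imap|pop3)'
```

Because ps is asked for command names only, the text of the pattern never appears in the input and cannot match itself.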

Sat, 05/28/2011 - 20:03
luisgrodriguez

Hi, the problem continues.

Here is my postconf -n output; please check whether I have something wrong:

postconf -n
alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
bounce_queue_lifetime = 120m
broken_sasl_auth_clients = yes
command_directory = /usr/sbin
config_directory = /etc/postfix
daemon_directory = /usr/libexec/postfix
debug_peer_level = 2
default_destination_concurrency_limit = 25
disable_vrfy_command = yes
home_mailbox = Maildir/
html_directory = no
inet_interfaces = all
local_destination_concurrency_limit = 4
mail_owner = postfix
mailbox_command = /usr/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME
mailq_path = /usr/bin/mailq.postfix
manpage_directory = /usr/share/man
maximal_backoff_time = 2000s
maximal_queue_lifetime = 120m
message_size_limit = 31457280
minimal_backoff_time = 500s
mydestination = $myhostname, localhost.$mydomain, localhost, admin.sysvirtuales.com
newaliases_path = /usr/bin/newaliases.postfix
queue_directory = /var/spool/postfix
queue_run_delay = 500s
readme_directory = /usr/share/doc/postfix-2.3.3/README_FILES
sample_directory = /usr/share/doc/postfix-2.3.3/samples
sender_bcc_maps = hash:/etc/postfix/bcc
sendmail_path = /usr/sbin/sendmail.postfix
setgid_group = postdrop
smtpd_banner = $myhostname ESMTP $mail_name ($mail_version)
smtpd_error_sleep_time = 2s
smtpd_hard_error_limit = 30
smtpd_helo_required = yes
smtpd_recipient_restrictions = permit_mynetworks permit_sasl_authenticated reject_unauth_destination
smtpd_sasl_auth_enable = yes
smtpd_sasl_security_options = noanonymous
smtpd_soft_error_limit = 15
unknown_local_recipient_reject_code = 550
virtual_alias_maps = hash:/etc/postfix/virtual

And I have a new problem in the logs:

May 27 12:37:48 admin postfix/smtpd[26356]: warning: 67A0CCA50B8C: queue file size limit exceeded
May 27 13:02:24 admin postfix/smtpd[26356]: warning: CA23FCA50B8C: queue file size limit exceeded
May 27 13:27:39 admin postfix/smtpd[26356]: warning: CF33ACA50B8C: queue file size limit exceeded
May 27 13:54:04 admin postfix/smtpd[26356]: warning: 0FC32CA50B8C: queue file size limit exceeded

Another thing: a few minutes ago the server stopped because of too many open files.

I am really worried. The problem does not happen all the time, but it is causing trouble with my clients.

Thanks for any possible help.

Sun, 05/29/2011 - 17:08
andreychek

Howdy,

Well, I'm not entirely certain what might cause those errors -- but after a little Googling, one idea would be to try commenting out the "message_size_limit" line in your /etc/postfix/main.cf, then restart Postfix.
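Before commenting the line out, it may be worth recording the value currently in effect so it can be restored later. The sketch below extracts one parameter from "postconf -n" style output, which lists only settings that differ from the defaults; "postconf -d" prints the built-in defaults instead. The function name postconf_value is illustrative and not part of Postfix.

```shell
# postconf_value NAME: read "name = value" lines on stdin (the format that
# postconf prints) and output the value of the parameter called NAME.
postconf_value() {
    awk -v key="$1" '{
        i = index($0, " = ")                 # position of the first separator
        if (i && substr($0, 1, i - 1) == key)
            print substr($0, i + 3)          # everything after " = "
    }'
}

# Example invocations:
#   postconf -n | postconf_value message_size_limit   # configured value
#   postconf -d | postconf_value message_size_limit   # built-in default
```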

I'm curious if those emails go through after that, or whether you continue to receive the queue file size limit errors.

-Eric

Wed, 06/01/2011 - 10:21
luisgrodriguez

Hi, the queue problem was fixed, but now I have many entries like this:

99 "/usr/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME"

I see that the problem is that there are too many open files on the system.

I changed the soft and hard options in limits.conf to 32780.

If somebody can help me with this, I would appreciate it.

Thanks.

Wed, 06/01/2011 - 10:39
andreychek

Hmm, that doesn't look like an error message... it's normal to see "/usr/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME" -- that's the command that delivers all of your email.

Are you seeing any associated error messages with that when the email is being delivered?

-Eric

Wed, 06/01/2011 - 11:38
luisgrodriguez

The error was in a mail report that I receive every day.

These are the bounced mails:

message bounce detail (by relay)
--------------------------------
  local (total: 146)
        99   "/usr/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME"
        13   libdl.so.2: cannot open shared object file: Error 23
        12   libtermcap.so.2: cannot open shared object file: Error 23
        10   libc.so.6: cannot open shared object file: Error 23
         5   sh: /usr/bin/procmail-wrapper: Too many open files in system
         4   Program failure (-11) of "/etc/webmin/virtual-server/lookup-dom...
         1   libnsl.so.1: cannot open shared object file: Error 23
         1   procmail: Unknown user "ricardo.int-rosenberg"
         1   unknown user: "ventas.int-rosenberg"
  none (total: 1)
         1   mail for gmail.sysvirtuales.com loops back to myself

Wed, 06/01/2011 - 15:15
andreychek

You may want to make sure that there are no limits set up in /etc/security/limits.conf that may be affecting mail delivery.

If you see any open file limits set in there, you may want to temporarily disable those, restart Postfix, and see if that helps.

-Eric

Wed, 06/01/2011 - 16:04
luisgrodriguez

Hi Eric, I really need help with this server. Could you provide some dedicated support and look at the server? My clients are losing mail because of this.

Wed, 06/01/2011 - 16:09
luisgrodriguez

I really need somebody to check the whole server configuration to make sure the correct settings are enabled.

Mon, 06/13/2011 - 12:16
luisgrodriguez

Hi everybody, my problem is still happening. I modified limits.conf, but the "Too many open files" error still occurs. Has anyone had a similar problem?

I have a VPS with 2 GB of RAM.

Mon, 06/13/2011 - 13:10
andreychek

Well, I'm unfortunately not sure what the issue there is... I see that you posted a note in the Jobs forum for Scott to take a look, was he able to poke around on your server? Did he offer any thoughts/advice?

-Eric

Mon, 06/13/2011 - 13:32
luisgrodriguez

Yes, he told me about a policyd implementation, but I think the problem is not only the number of SMTP requests per hour on my system. I think the problem is in the Dovecot configuration or the HTTP configuration, but I am not sure.

Mon, 06/13/2011 - 14:49
luisgrodriguez

I have a question about this. I ran this command:

> lsof -n|grep -oE '^[a-z]+'|sort|uniq -c|sort -n

and the result was:

> lsof -n|grep -oE '^[a-z]+'|sort|uniq -c|sort -n
8 uniq
9 grep
9 init
14 syslogd
16 monit
16 sort
17 udevd
21 sh
22 collectin
22 dbus
24 lookup
25 lsof
25 munin
28 su
34 proftpd
35 pickup
35 qmgr
36 anvil
36 proxymap
38 postgrey
43 named
45 crond
49 bash
50 smtpd
78 pop
82 dovecot
99 yum
125 master
141 php
145 mysqld
160 saslauthd
166 imap
191 sshd
244 miniserv
2640 httpd

When I run

> lsof | awk '{print $NF}' | sort | wc -l

the number of open files is between 4500 and 5800, and the problem begins when the number is high. The system-wide maximum number of open files, from cat /proc/sys/fs/file-max, is 262144.

Why can't the number of open files go above about 6000 on my system if the limit is 262144?
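A more direct figure than the lsof line count is /proc/sys/fs/file-nr, which holds three numbers: allocated file handles, allocated but unused handles, and the system-wide maximum. lsof also lists entries that are not file descriptors (memory-mapped libraries, working directories), so its count is not the same thing. The sketch below just labels the three fields; the function name file_nr_summary is illustrative. Two hedged observations: "Too many open files in system" is the message for errno 23 (ENFILE), which matches the "Error 23" lines in the reports, whereas a per-process limit gives "Too many open files" (errno 24, EMFILE). And on some VPS platforms the host can impose its own cap on open files that would not appear in file-max; that is a possibility to raise with the provider, not something established in this thread.

```shell
# file_nr_summary: read the contents of /proc/sys/fs/file-nr on stdin and
# print the three fields with labels.
file_nr_summary() {
    awk '{ printf "allocated=%d unused=%d max=%d\n", $1, $2, $3 }'
}

# Example invocation:
#   file_nr_summary < /proc/sys/fs/file-nr
```

Running it at the moment the errors appear shows whether the allocated count is anywhere near the reported maximum.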

Mon, 06/13/2011 - 15:27
andreychek

My first thought is that there's still some lingering issue with the limits.conf... if you modify that file, it can sometimes require a reboot in order to apply the changes (or at least, being very certain that all the daemons in question have re-read that particular file).

You'll want to make certain that there's nothing at all in the limits.conf that is remotely mail related... Postfix, Saslauthd, Dovecot -- those all have users associated with the daemons, if any of them have entries in the limits.conf file, I'd be suspicious of them.
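One way to confirm what limits a running daemon actually has, rather than what limits.conf says, is to read that process's limits file under /proc. This is a sketch under an assumption: it relies on the kernel providing /proc/PID/limits, which newer kernels do, and whether this server's kernel does is not known from the thread. The function name max_open_files is illustrative.

```shell
# max_open_files: read the contents of a /proc/PID/limits file on stdin and
# print the soft and hard values from its "Max open files" row.
max_open_files() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }'
}

# Example invocation (1234 stands in for the PID of the daemon to inspect):
#   max_open_files < /proc/1234/limits
```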

-Eric

Mon, 06/13/2011 - 15:31
luisgrodriguez

Sorry for not mentioning it: I rebooted the server.

This is the current content of limits.conf:

* soft nproc 2047
* hard nproc 16384
* soft nofile 1024
* hard nofile 65536

Mon, 06/13/2011 - 15:36
andreychek

I'm not sure what the system does when you reach a "soft" limit -- just to rule that out as a possible issue, you may want to try raising your soft limits to values closer to the hard limits.
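For background, on Linux the soft limit is the value that is actually enforced for a process, and the hard limit is the ceiling up to which an unprivileged process may raise its own soft limit. So with a soft nofile of 1024, a process that never raises its limit is held to 1024 descriptors regardless of the hard value. The sketch below prints both values for the current shell; the function name show_nofile_limits is illustrative, and the -S and -H options are those of the ulimit builtin in bash and similar shells.

```shell
# show_nofile_limits: print the soft and hard open-file limits of the
# current shell, which child processes started from it will inherit.
show_nofile_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

# Example invocation:
#   show_nofile_limits
```

Running it as the user that owns the daemon in question, in a fresh login session, shows whether the limits.conf values are being picked up for that user.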

-Eric

Mon, 06/13/2011 - 15:42
luisgrodriguez

At the moment I have 12 processes.

Mon, 06/13/2011 - 15:50
luisgrodriguez

Do you mean setting almost the same value for hard and soft?

For example:

soft 4096 hard 5120

Mon, 06/13/2011 - 15:52
andreychek

I'm just suggesting making the soft limits really high -- or removing them altogether for the time being :-)

I'm honestly not sure why you're experiencing the problems you're seeing, and at this point, I'm mostly just grasping at straws... but it's worth a try :-)

-Eric

Mon, 06/13/2011 - 15:55
luisgrodriguez

Do you think the problem could be hardware related? I have 2 GB of RAM, but this is a VPS. I have another server with 512 MB of RAM, and it has more open files than this server (httpd alone always has 4500 to 5000 files open).

Mon, 06/13/2011 - 15:59
andreychek

Well, I suppose anything is possible, but what you're experiencing sounds like a software problem.

-Eric

Mon, 06/13/2011 - 16:01
luisgrodriguez

Eric, could you help by looking at the server directly? Maybe I have forgotten to mention something.

Could you help in this way?

Mon, 06/13/2011 - 17:11
andreychek

I think your best bet would be to hire someone you can work with to solve this particular problem. You can post a note in the Jobs forum here, or there's a number of other sites on the Net that have sysadmins competing to do server work.

-Eric