[Pkg-exim4-users] sporadic invalid helo
Jonathan Addleman
2016-02-27 21:06:18 UTC
Hello,

I've been using exim for many, many years, though rarely had to change
anything, so I'm not particularly well-versed in the config. Since
moving to a new host, I've been having troubles with sent mail either
being rejected, or silently ignored by many recipients. Most of the
time, it works, but occasionally not.

The bounce errors that I very occasionally get tell me that the HELO
identification was invalid, and I can see that exim is identifying
itself with just the hostname, not the full domain. (i.e. sepia, instead
of sepia.redowl.ca). The vast majority of the time, it does identify
with the FQDN.

I'm pretty sure the problem lies in
/etc/exim4/conf.d/transport/10_exim4-config_transport-macros, with the
line

REMOTE_SMTP_HELO_DATA=${lookup dnsdb {ptr=$sending_ip_address}{$value}{$primary_hostname}}

When the lookup succeeds, the HELO uses the proper domain. But it seems
my host has a slightly wonky DNS server, and it occasionally fails. In
that case, it uses $primary_hostname. From the docs I've read,
(https://wiki.debian.org/PkgExim4UserFAQ#How_does_exim_find_out_its_host_name_to_use_in_HELO.2FEHLO.3F)
this value is not set, and should be getting the full domain, but it
seems to be only getting the hostname.
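
One way to measure how often the reverse lookup actually fails is to repeat it many times and count the misses. The following is only a minimal sketch in Python, not anything taken from Exim: the function names are hypothetical, and socket.gethostbyaddr goes through the system resolver (so /etc/hosts may answer before DNS does), which is an assumption that differs from Exim's direct dnsdb lookup.

```python
import socket


def ptr_or_none(ip, lookup=socket.gethostbyaddr):
    """Return the reverse-lookup name for ip, or None if the lookup fails."""
    try:
        return lookup(ip)[0] or None
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def helo_name(ip, fallback, lookup=socket.gethostbyaddr):
    """Mirror the macro's logic: PTR name if there is one, else the fallback."""
    return ptr_or_none(ip, lookup) or fallback


def count_failures(ip, attempts=50, lookup=socket.gethostbyaddr):
    """Repeat the reverse lookup and count how many attempts returned nothing."""
    return sum(ptr_or_none(ip, lookup) is None for _ in range(attempts))
```

Calling count_failures() with the server's own sending address, for example from a cron job, would give a rough idea of how often the fallback to $primary_hostname is being used.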

'hostname -f' does return the full sepia.redowl.ca name... Is there some
additional configuration that I might need to do so that exim gets that
same value?

Thanks for any suggestions you can offer. If it matters, I'm using the
most recent stable release, 4.84-8+deb8u2.
--
Jonathan Addleman - http://www.redowl.ca
J G Miller
2016-02-27 23:04:51 UTC
At 16:06h, on Saturday, February 27, 2016,
in message <***@redowl.ca>,
on the subject of "[Pkg-exim4-users] sporadic invalid helo",
Jonathan Addleman explained -
Post by Jonathan Addleman
When the lookup succeeds, the HELO uses the proper domain. But it seems
my host has a slightly wonky DNS server, and it occasionally fails. In
that case, it uses $primary_hostname. From the docs I've read,
(https://wiki.debian.org/PkgExim4UserFAQ#How_does_exim_find_out_its_host_name_to_use_in_HELO.2FEHLO.3F)
this value is not set, and should be getting the full domain, but it
seems to be only getting the hostname.
Perhaps this will help ...

In

/etc/exim4/main/01_exim4-config_listmacrosdefs

I have

MAIN_HARDCODE_PRIMARY_HOSTNAME = host.fqdn

and in

/etc/exim4/main/02_exim4-config_options

I have

primary_hostname = MAIN_HARDCODE_PRIMARY_HOSTNAME

and also (but this is probably not relevant to your problem)

qualify_domain = MAIN_HARDCODE_PRIMARY_HOSTNAME

So you need to ensure that primary_hostname is set to something,
preferably via a "global" macro (global in the sense that it can be
reused for one or more "specific" macro instances) in 01_exim4-config_listmacrosdefs.
Jonathan Addleman
2016-03-02 03:37:27 UTC
Post by J G Miller
So you need to ensure that primary_hostname is set to something,
preferably via a "global" macro (global in the sense that it can be
reused for one or more "specific" macro instances) in 01_exim4-config_listmacrosdefs.
Thanks, that helps a lot! I'm a little wary of this fix though, because
the docs say pretty clearly "Please refrain from using primary_hostname
unless you cannot avoid using it. It enhances the complexity of your
configuration and leads to error issues that are a hell to debug."

That said, I don't see where the problems might arise here, so I think
I'll do what you suggest, unless someone else can explain why it would
be a bad idea!
--
Jonathan Addleman - http://www.redowl.ca
J G Miller
2016-03-02 16:06:40 UTC
At 22:37h, on Tuesday, March 01, 2016,
in message <***@redowl.ca>,
on the subject of "Re: [Pkg-exim4-users] sporadic invalid helo -- setting primary_hostname",
Jonathan Addleman reported --
Post by Jonathan Addleman
Thanks, that helps a lot! I'm a little wary of this fix though, because
the docs say pretty clearly "Please refrain from using primary_hostname
And did you read the sentences before that one?

QUOTE

Debian's exim4 default configuration does not set primary_hostname.

Exim then defaults to uname() to find the host name.

---> If that call only returns one component, <---

gethostbyname() or getipnodebyname() is used to obtain the fully qualified host name.

UNQUOTE

You stated in your original message

QUOTE

The bounce errors that I very occasionally get tell me that the HELO
identification was invalid, and I can see that exim is identifying
itself with

====>just the hostname<=====,

not the full domain. (i.e. sepia, instead of sepia.redowl.ca)

UNQUOTE

Therefore the problem is that occasionally the call to gethostbyname() or
getipnodebyname() is failing to get the full hostname with FQDN.

If you are afraid of setting primary_hostname, debug your system name calling
to ensure that gethostbyname() or getipnodebyname() return the fully qualified
hostname 100% of the time without failure.

That is actually the more worrying, fundamental problem: it may well affect
other things too, and it should concern you more than the workaround of
setting primary_hostname.
Jonathan Addleman
2016-03-02 17:29:43 UTC
Post by J G Miller
At 22:37h, on Tuesday, March 01, 2016,
on the subject of "Re: [Pkg-exim4-users] sporadic invalid helo -- setting primary_hostname",
Jonathan Addleman reported --
Post by Jonathan Addleman
Thanks, that helps a lot! I'm a little wary of this fix though, because
the docs say pretty clearly "Please refrain from using primary_hostname
And did you read the sentences before that one?
Indeed. It is worrisome. I'm at quite a loss as to troubleshooting it
though. It seems that hostname -f doesn't use gethostbyname() or
getipnodebyname() (at least as far as I can see from a strace). Are
there other command line tools that I could use to test things?

Also, I've tried a few things, but I haven't been able to get exim to
consistently send just the hostname without domain. I changed the line
in 10_exim4-config_transport-macros to read just
REMOTE_SMTP_HELO_DATA==$primary_hostname, but it seems to still return
the fqdn as expected every time I try. I don't know what might change in
those cases where it doesn't work.

Any suggestions are very welcome!
--
Jonathan Addleman - http://www.redowl.ca
J G Miller
2016-03-02 18:23:52 UTC
At 12:29h, on Wednesday, March 02, 2016,
in message <***@redowl.ca>,
on the subject of "Re: [Pkg-exim4-users] sporadic invalid helo -- setting primary_hostname",
Jonathan Addleman explained --
Post by Jonathan Addleman
Indeed. It is worrisome. I'm at quite a loss as to troubleshooting it
though. It seems that hostname -f doesn't use gethostbyname() or
getipnodebyname() (at least as far as I can see from a strace). Are
there other command line tools that I could use to test things?
Only thing I can think of for testing at the command line is writing a
simple C program to use gethostbyname for your own host name.
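
The same check can be approximated without writing C. This is a rough sketch in Python under stated assumptions: socket.gethostbyname_ex is expected to go through the C library resolver (and therefore /etc/nsswitch.conf) on a typical Linux build, and the function names, along with the choice to fall back to the first alias containing a dot, are illustrative guesses rather than a copy of Exim's code.

```python
import os
import socket


def qualified_hostname(node=None, resolve=socket.gethostbyname_ex):
    """Take the uname() node name; only if it has a single component,
    ask the resolver for a name that contains a dot."""
    node = node or os.uname().nodename
    if "." in node:
        return node
    try:
        canonical, aliases, _addresses = resolve(node)
    except OSError:  # lookup failed: keep the short name
        return node
    for name in [canonical, *aliases]:
        if "." in name:
            return name
    return node


def count_unqualified(attempts=100, node=None, resolve=socket.gethostbyname_ex):
    """Repeat the lookup and count results that came back without a domain."""
    return sum(
        "." not in qualified_hostname(node, resolve) for _ in range(attempts)
    )
```

If count_unqualified() ever returns a non-zero value on the affected machine, the intermittent short name is coming from the system's name lookup rather than from Exim itself.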

Intermittent sporadic problems are always the hardest to fix.

It could possibly be related to system load and demands on your network
name lookup mechanism.

The most obvious nuisance to name lookups is nscd.

Are you by any chance running nscd with hosts cache enabled and
nsswitch.conf pointing to cache?

Probably not, but it needs to be eliminated just in case.

Are you running NIS?

Are you running named/bind9 for your local hosts?

And you should review the contents of /etc/nsswitch.conf anyway, since
gethostbyname consults /etc/nsswitch.conf to determine which mechanism(s)
to use for the name lookup.

If you have an /etc/hosts with the FQDN host name properly defined,
perhaps just putting files (i.e. /etc/hosts) before dns on the hosts line
(if dns currently comes first) could cure the problem, if indeed the
problem is caused by an occasional failure in using dns due to load,
bind9 not running, or whatever.
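
The lookup order can be checked by hand, or with a few lines of script. The following Python sketch is a hypothetical helper (the function names are not from any library) that reads the text of an nsswitch.conf and reports the sources on its hosts line; it ignores bracketed action items such as [NOTFOUND=return].

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the 'hosts:' line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            rest = re.sub(r"\[[^\]]*\]", " ", rest)  # drop [STATUS=action] items
            return rest.split()
    return []


def files_before_dns(sources):
    """True if /etc/hosts ('files') is consulted before DNS."""
    if "files" not in sources:
        return False
    if "dns" not in sources:
        return True
    return sources.index("files") < sources.index("dns")
```

For example, files_before_dns(hosts_sources(open("/etc/nsswitch.conf").read())) should be true on a system using the "hosts: files dns" default.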

Incidentally, according to the Debian manual page for gethostbyname,
the maintainers of exim4 need to update the source code because
getaddrinfo should be used instead.

QUOTE

The gethostbyname*(), gethostbyaddr*(), herror(), and hstrerror() functions are obsolete.

Applications should use getaddrinfo(3), getnameinfo(3), and gai_strerror(3) instead.

UNQUOTE

This is also highlighted as a SECURITY concern at

<http://blog.erratasec.COM/2015/01/you-shouldnt-be-using-gethostbyname.html>

QUOTE

Tuesday, January 27, 2015

You shouldn't be using gethostbyname() anyway

Today's GHOST vulnerability is in gethostbyname(), a Sockets API function
from the early 1980s. That function has been obsolete for a decade.

What you should be using is getaddrinfo() instead, a newer function that
can also handle IPv6.

UNQUOTE

This blog entry ironically resulted in a comment pertinent to Exim4 added by a reader.

QUOTE

celula_x said...

if it's old; wonder why stocked Debian's Exim still calls it =\

7:41 PM

UNQUOTE

If Debian developers/maintainers are concerned about security, are they lobbying
the upstream Exim4 code authors/maintainers to get this changed?

And despite the dire warning in Debian's Exim4 setup instructions
about not using MAIN_HARDCODE_PRIMARY_HOSTNAME, I have always set it on
my heavily customized Exim4 setup (now on four machines) since 2005 or
maybe earlier and have never had a problem.
Jonathan Addleman
2016-03-02 19:16:23 UTC
Post by J G Miller
At 12:29h, on Wednesday, March 02, 2016,
on the subject of "Re: [Pkg-exim4-users] sporadic invalid helo -- setting primary_hostname",
Jonathan Addleman explained --
Post by Jonathan Addleman
Indeed. It is worrisome. I'm at quite a loss as to troubleshooting it
though. It seems that hostname -f doesn't use gethostbyname() or
getipnodebyname() (at least as far as I can see from a strace). Are
there other command line tools that I could use to test things?
Only thing I can think of for testing at the command line is writing a
simple C program to use gethostbyname for your own host name.
That's what I feared...

I'm not using nscd, NIS, or bind9, and nsswitch.conf has the default
"hosts: files dns" line.

One thing that I thought *might* be related is that my hosts file showed
two 127.0.0.1 lines, one for localhost, and another for the actual
hostname and fqdn. Some googling showed that there are cases where that
could be problematic. I changed the hostname one to 127.0.1.1, in any
case. Maybe it will fix something down the road.
Post by J G Miller
And despite the dire warning in Debian's Exim4 setup instructions
about not using MAIN_HARDCODE_PRIMARY_HOSTNAME, I have always set it on
my heavily customized Exim4 setup (now on four machines) since 2005 or
maybe earlier and have never had a problem.
I'm leaning pretty heavily towards doing that now. Thanks for your help!
--
Jonathan Addleman - http://www.redowl.ca
J G Miller
2016-03-02 20:10:17 UTC
At 14:16h, on Wednesday, March 02, 2016,
in message <***@redowl.ca>,
on the subject of "Re: [Pkg-exim4-users] sporadic invalid helo -- setting primary_hostname",
Jonathan Addleman pondered -
Post by Jonathan Addleman
One thing that I thought *might* be related is that my hosts file showed
two 127.0.0.1 lines, one for localhost, and another for the actual
hostname and fqdn.
But do you also have an entry for the network interface device (not the loopback
127.0.0.1) that exim4 is listening on?
Post by Jonathan Addleman
Some googling showed that there are cases where that
could be problematic..
Yes, having two possibly conflicting entries is a problem, since how
does the name lookup know which is the one you really want to use? ;+}
Post by Jonathan Addleman
I changed the hostname one to 127.0.1.1
My suggestion would be in /etc/hosts to have

127.0.0.1 localhost.localdomain localhost

(localdomain is chosen rather than local so as not to conflict
with Avahi/Bonjour local) and then for your network interface
that exim4 is listening on

172.86.178.126 sepia.redowl.CA sepia

or whatever IP (non loopback) you do actually use.
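
To confirm which addresses the host name is actually attached to in /etc/hosts, a small parser helps. This is a minimal sketch in Python with hypothetical function names; it only reads hosts-file text and does not consult the resolver.

```python
import ipaddress


def addresses_for(hosts_text, name):
    """Return every address in hosts-file text that lists name, in file order."""
    found = []
    wanted = name.lower()  # host names are compared case-insensitively
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split columns
        if len(fields) >= 2 and wanted in (f.lower() for f in fields[1:]):
            found.append(fields[0])
    return found


def only_loopback(addresses):
    """True if the list is non-empty and every address is a loopback address."""
    return bool(addresses) and all(
        ipaddress.ip_address(a.split("%", 1)[0]).is_loopback for a in addresses
    )
```

With the two lines suggested above, addresses_for(text, "sepia") yields only the non-loopback address, whereas a file that maps the host name to 127.0.0.1 or 127.0.1.1 alone would make only_loopback() return True.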

So give that a try without doing the forced primary host name
to see if that makes any difference, and at minimum, it will
indicate whether or not the issue of intermittent bad name lookups
is coming from that or is in fact arising from a different misconfiguration.

Golden rule when trying to fix these problems (which often results in
other problems being discovered and fixed) -- only change one thing at a time
and see if it makes any difference.
