"Virtual server somedomain.com does not exist" when replicating Virtualmin domains.

Virtualmin Replication in Cloudmin 8.4 Pro (with Virtualmin 5.0 GPL and Webmin 1.782 GPL) has trouble recognizing when domains have been deleted. From an e-mail notification I received this morning:

Creating temporary directories .. .. done

Backing up 104 virtual servers on source system .. .. backup failed : Virtual server somedomain.com does not exist

In this particular case, somedomain.com was removed from Worker1, and replication to Worker2 should have simply deleted the offending domain, as the "Delete domains no longer on source" box was checked. Somedomain.com still shows up in the "Virtual Domains to Replicate" list (on the right) though, which is a bit strange, as that list doesn't appear to update properly. When I move it to the exception list on the left, it disappears. Also, we've been having some trouble with simply checking "All on source system" and expecting it to work, and we've got about a dozen domains that end up in the exception list (on the left), or replication fails for various reasons. This is out of a list of a whopping 120 domains, and I'd be scared of what would happen (or how much time I'd be dedicating to fixing replication every day) if we had thousands of domains, and domains were constantly being added and deleted.

Status: Needs work

Comments

Title: "Virtual server somedomain.com does not exist" when replicating Virtualmi domains. » "Virtual server somedomain.com does not exist" when replicating Virtualmin domains.

OK, it looks like the bug here is that when you select specific domains to sync, those that have been removed from the source system aren't automatically excluded from the list. This will be fixed in the next Cloudmin release.
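To illustrate the behaviour described above, here is a minimal sketch of pruning a saved selection against the domains that still exist on the source. This is not Cloudmin's actual code; the function name `prune_selected` and the sample domain names are hypothetical and only show the idea.

```python
def prune_selected(selected, on_source):
    """Split a saved selection into domains still on the source and stale ones.

    Hypothetical helper for illustration only; not part of Cloudmin.
    """
    existing = set(on_source)
    # Preserve the order of the saved selection in both result lists.
    keep = [d for d in selected if d in existing]
    stale = [d for d in selected if d not in existing]
    return keep, stale


# Sample data (made up): somedomain.com was deleted from the source.
selected = ["example.org", "somedomain.com", "example.net"]
on_source = ["example.net", "example.org"]

keep, stale = prune_selected(selected, on_source)
print("replicate:", keep)
print("no longer on source:", stale)
```

With a check like this, only the domains in `keep` would be backed up, and the ones in `stale` could be handed to the "Delete domains no longer on source" step instead of aborting the whole job.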

Status: Active » Fixed

Hi Jamie,

I can confirm this hasn't been fixed, as I am currently experiencing the issue. Please reply with a solution.

@itinfra - which Cloudmin version are you running there?

Hi JamieCameron,

My Cloudmin version is 9.3 Pro.

Regards

itinfra - this is definitely fixed. Are you sure you're seeing the exact same issue?

I can confirm this behavior on Debian 9 with Webmin 1.883, Usermin 1.741, Virtualmin 6.03 and Cloudmin 9.3 Pro.

I also have other strange issues with the replication.

I have three servers (web0, web1, web2). Cloudmin runs on web0, and this is where I want to create and edit the domains. Databases are stored on a remote cluster, users go to a remote LDAP cluster, and /home is on a remote NFS cluster; this all works. I want web1 and web2 to act as frontend servers for www, mail, DNS and so on. My plan is to replicate the domains from web0 to web1 and web2. All three are connected to LDAP, MySQL and NFS. When I replicate a domain (without Home directory, BIND DNS domain and MySQL database) I get this:

Failed to restore on web1.xx.local :
Checking for missing features .. .. all features in backup are supported
Checking for errors in backup .. .. no errors found
Starting restore..
Extracting backup archive files .. .. done
Re-creating virtual server domain2.com .. .. a clash was detected : A unix user named domain2 already exists - try selecting a different administration username
Restore failed!

Failed to restore on web2.xx.local :
Checking for missing features .. .. all features in backup are supported
Checking for errors in backup .. .. no errors found
Starting restore..
Extracting backup archive files .. .. done
Re-creating virtual server domain2.com .. .. a clash was detected : A unix user named domain2 already exists - try selecting a different administration username
Restore failed!

So I thought: OK, I don't need LDAP on the targets for Virtualmin (I want to use it for other applications, e.g. so a user can log in to a quarantine or something). I deployed two new servers (web3 and web4) with connections to the remote MySQL and NFS, and replicated:

Failed to restore on web3.xx.local :
Checking for missing features .. .. all features in backup are supported
Checking for errors in backup .. .. no errors found
Starting restore..
Extracting backup archive files .. .. done
Re-creating virtual server domain2.com .. Error: No LDAP client configuration file was found on your system, so the LDAP server must be set on the Module Config page
X-Frame-Options: SAMEORIGIN
Content-Security-Policy: script-src 'self' 'unsafe-inline' 'unsafe-eval'; frame-src 'self'; child-src 'self'
Content-type: text/html; Charset=UTF

Edit: In another test replication (which I can't reproduce) I was able to replicate the domains and see the users in the /etc files (passwd, shadow and so on). Then I deleted the domains on web0 and replicated with the option "Delete domains no longer on source", and got an error:

.. deletion on web3.xx.local of 2 domains failed : Error: No LDAP client configuration file was found on your system, so the LDAP server must be set on the Module Config page
Error
-----
No LDAP client configuration file was found on your system, so the LDAP server must be set on the Module Config page
-----
Deleting virtual server domain.com

So are these servers sharing users and groups via a common LDAP server? It looks like they are (based on the error about the user already existing), but Virtualmin doesn't think so.

If you run virtualmin check-config on one of the systems as root via SSH, does it say anything about LDAP being used?
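As a rough way to see why a "unix user already exists" clash can appear on a system that never created that user locally, the sketch below compares /etc/passwd with what the system's name service returns. It assumes a Linux system where Python's `pwd` module resolves users through NSS (and therefore through LDAP if that is configured); the helper names are hypothetical and this is not how Virtualmin itself performs the check.

```python
import pwd


def in_local_passwd(name, passwd_text):
    """Return True if `name` has an entry in the given passwd file contents."""
    for line in passwd_text.splitlines():
        if line and not line.startswith("#") and line.split(":", 1)[0] == name:
            return True
    return False


def resolves_via_nss(name):
    """Return True if the system can resolve `name` (local files, LDAP, etc.)."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def classify(name, passwd_path="/etc/passwd"):
    """Classify a user as 'local', 'directory' (e.g. LDAP) or 'missing'."""
    with open(passwd_path) as fh:
        if in_local_passwd(name, fh.read()):
            return "local"
    return "directory" if resolves_via_nss(name) else "missing"


if __name__ == "__main__":
    # "domain2" is the username from the error output above.
    print(classify("domain2"))
```

If this prints "directory" on a destination system, the user is coming from the shared directory rather than from local files, which would match the clash reported during restore.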

On login at the server itself:

web1 login: root
Password:
LDAP Password:

and virtualmin check-config:

....

LDAP user and group management is properly configured.

....

On all systems, is the LDAP client configured to use the same IP or hostname for the LDAP server?

Any chance we could get access to your system to see what's going wrong inside Cloudmin/Virtualmin here?

Sure. It's a testing environment. What do you prefer: SSH, web, or something like TeamViewer? I'll set this up on Monday.

Any news on this Jamie?

OK, looking into this there are some deeper Virtualmin bugs that need to be fixed to get this working, sorry. I will update this ticket with progress..

OK, I've closed the SSH port. If you need it again, let me know.

This will be properly fixed in the next release.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Is this fixed? I tried it with no luck.

I forgot to update this - the latest Virtualmin release and the upcoming Cloudmin release should fix these issues.

Any news on this? It does not work on Cloudmin 9.4 with Virtualmin 6.05.

Releases that include fixes for this issue are out already - make sure you upgrade to the latest Virtualmin and Cloudmin.

On source: Webmin 1.902, Usermin 1.751, Virtualmin 6.05, Cloudmin 9.4 Pro

On destination: Webmin 1.902, Usermin 1.751, Virtualmin 6.05

These should be the latest. Am I wrong?

Those are all the latest versions..

Any chance we can login to your system to see what's going wrong?

No problem. I've sent the IP, user, password and root password to your email. Please let me know if you need any further information, or when you are done so that I can deactivate SSH.

Thanks, I was able to log in! But which specific domain were you trying to replicate, and to which destination system?

The final plan is to replicate all domains on host000 (Cloudmin), without BIND, MySQL and the home directory, to host001 and host002.

To trigger the error, what action are you taking in Virtualmin or Cloudmin exactly? If it's something done via the UI, can I also log in to it? I can't access port 10000 from my IP address 67.174.243.254

Port is open for you. What I do: "Cloudmin" > "Virtualmin Settings" > "Virtual Server Replication" > open the existing Job > "Replicate Now"

Jamie? Are you on this?

Hey Jamie,

Have you got any news for me? I have to go to production soon.

Applying the patch on both destination systems and rebooting them doesn't work.

Re-creating virtual server ddd.de .. .. a clash was detected : A unix user named ddd already exists - try selecting a different administration username Restore failed!

Unfortunately it is still not working, with both patches applied on both destination systems and even on the source system.

Any new ideas for this?

Jamie, do you need access to the destination system?

Any new findings?

Yes, access to the destination system would be really useful in resolving this.

I've sent you an email with all the information.

Ok, I'm in now .. taking a look.

I think I see the issue - on your remote systems, you need to go to the Virtualmin Configuration page, and set the "Store users and groups" option to "LDAP". Then run a config re-check.

Well! On one of them it was already set to LDAP, but the re-check did the trick! Thanks.

Now it says: "a clash was detected : The DNS domain ddd.de is already hosted by your DNS server Restore failed!" That is correct, because the destinations are slave DNS servers. But in the replication job I selected "Advanced replication settings" > "Virtualmin features to replicate" > "All except selected features .." > "Home directory" + "BIND DNS domain" + "MySQL databases". So my understanding is that BIND should not be replicated. Am I wrong?

It may be better not to have the remote systems set up as DNS slaves, and instead use the replication to make them masters as well. That way they will be able to serve DNS even if the primary is down for an extended period.