Suggestion : on SUPER busy servers, clone access log file before searching it

#1 Sat, 04/11/2015 - 04:36

EcchiOli

Hello !

This is a friendly suggestion, not a complaint or a plea for help :)

I noticed that on my busiest website, which gets a lot of traffic, it is practically impossible to search the access log for a string of text from within Virtualmin.

My suggestion: when either of two conditions is met (the site's volume of activity is high, or a search has already been running for some time without finishing), Virtualmin should stop searching the "live" access log file, make a temporary copy, and search that copy instead.

More details, perhaps? Here is how I search access logs: Virtualmin home > Logs and reports > Apache access log > "Only show lines with text", enter the string, hit Enter.

After this, one of two things happens. (a) On "normal" sites the search lasts anywhere from no noticeable time to one or two seconds, and results are shown. (b) On very busy sites the search takes much longer, to the point where it may run for more than 30 seconds without returning anything, seemingly searching endlessly.

I imagine case (b) happens when the access log keeps being appended to, again and again and again, while Virtualmin is still trying to parse its contents. Parsing a file that grows endlessly while the search is running must be hard on the system.

I tested manually, grepping my busiest site's access log file, and indeed, it can take forever, with monitoring stats showing the hard disk activity peaks during the search.

On the other hand, I can still easily cp the access log file to a new file, and grepping that copied, non-live file takes only a few seconds before results are returned.

Hence my suggestion: if Virtualmin can detect that searching a constantly updated file is getting nowhere, it would do much better to spend the first few seconds cloning the file and then search the clone.
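
To make the idea concrete, here is a minimal sketch in Python of the behaviour I have in mind. It is only an illustration, not how Virtualmin works internally: the function names search_lines and search_with_fallback and the budget_seconds parameter are made up for this example, and the 2-second default is an arbitrary assumption.

```python
import os
import shutil
import tempfile
import time


def search_lines(path, needle, deadline=None):
    """Return the lines of `path` containing `needle`.

    Returns None if `deadline` (a time.monotonic() value) passes
    before the end of the file is reached.
    """
    matches = []
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if deadline is not None and time.monotonic() > deadline:
                return None  # give up: the time budget is spent
            if needle in line:
                matches.append(line.rstrip("\n"))
    return matches


def search_with_fallback(log_path, needle, budget_seconds=2.0):
    """Search the live log first; on timeout, search a temporary copy.

    Hypothetical helper: budget_seconds is an assumed threshold,
    not a real Virtualmin setting.
    """
    live = search_lines(log_path, needle, time.monotonic() + budget_seconds)
    if live is not None:
        return live

    # The live search took too long: copy the log and search the copy.
    fd, snapshot_path = tempfile.mkstemp(prefix="access_log_snapshot_")
    os.close(fd)
    try:
        shutil.copyfile(log_path, snapshot_path)
        return search_lines(snapshot_path, needle)
    finally:
        os.remove(snapshot_path)  # always clean up the temporary copy
```

The copy holds only what was in the log at the moment of cloning, so lines written afterwards will not show up in the results. That seems an acceptable trade-off for getting an answer at all.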

Good day everyone !