The first thing any Network Administrator should do when they enter the office each morning is run through a list of maintenance checks on all the servers and network infrastructure they are responsible for, to confirm that everything is working correctly.

Now I say “should” because we all know that in the real world you are likely to have a list of existing problems to work on, emails to respond to and phone calls to answer before you spend any time on “maintenance” procedures.

I am here to tell you that if you start doing this regular maintenance, you will end up with more time to work on other projects, because you will catch a lot of problems in the early stages before they develop into serious issues.

Ideally, you can run these checks before any other users get into the office and start to use the network. That way you can remedy, or at least start working on, any issues you find before they have a chance to affect anyone.

Now this might require a bit of a paradigm shift for many network administrators who are quite content to sit at their desk and wait for problems to start having a detrimental effect on users before taking any action. But by proactively taking care of challenges before they have any impact on users, you are looking after your network in a more skilful and efficient manner.

This might be hard to implement immediately if you are already stuck in your ways, but it is something you should be working towards, as it will make your life easier: you will have fewer urgent and panicked users to deal with.

I have written the below system with a Microsoft server and network in mind, but many of the same things are applicable for Linux and Mac, and if you are familiar with those systems then you will be able to work out what part relates to Linux and Mac. At the very least you will be able to take the underlying principles of running a proactively maintained and efficient network and apply them to whatever networking environment you are in.

Also, some of these tasks can obviously be automated, and commercial software can be used to do this. That is fine; if it makes the process more efficient then I encourage you to use it. Just make sure that one of your daily checks confirms that whatever software you use to automate these tasks is running correctly and reporting accurately. Don’t let automation of any work be an excuse for you not to pay attention to the expected outcomes of the task that is being automated!

Now on with the list. I have put these in no particular order of importance, as I will leave that up to you depending on your individual situation.

System Error and Warning Logs

Event Logs, Error Logs, and Warning Logs offer a wealth of information on any computer and network that will mostly go unnoticed until a problem arises that is severe enough to require some actual investigative work.

But part of maintaining a well-tuned system is attending to these issues before they become serious problems. These logs will be written to no matter what state the machine is in. Even on a freshly installed Windows workstation you are likely to see a few warning and error messages appear immediately, but that is no reason to scoff at or ignore them, as they offer valuable insights into the operating state of your computer that you can learn from.

Think of the time you spend viewing these logs on a daily basis as teaching yourself to recognise the patterns of a computer that is running well. Then you will start to notice when something out of the ordinary begins to occur.

You will learn that some warning messages can be safely ignored, but at least you will know which ones these are rather than guessing. If a new warning message appears you can take the time to investigate it, and you will learn more about the system you are in control of. You can learn what steps it will take to stop the Error and Warning messages from occurring, and again you will find out more about your system.

The additional knowledge you learn from viewing this vital system information on a daily basis will help you out in the long run by allowing you to run your network more confidently and efficiently. Consider the time you spend doing this every day as an investment in your long-term skills and the long-term health of your system.
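As a rough illustration of the “know which warnings are normal” idea, here is a minimal Python sketch. It assumes you have already exported today’s events as (event ID, level) pairs by whatever means suits you; the sample data, the event IDs and the function name unfamiliar_events are all made up for the example and are not taken from any particular tool.

```python
from collections import Counter

def unfamiliar_events(events, known_ids, levels=("Warning", "Error")):
    """Count warning/error event IDs that are not in the known baseline."""
    return Counter(
        event_id
        for event_id, level in events
        if level in levels and event_id not in known_ids
    )

# Hypothetical sample: (event_id, level) pairs exported from a log viewer.
todays_events = [
    (7036, "Information"),
    (1014, "Warning"),
    (1014, "Warning"),
    (41, "Error"),
    (41, "Error"),
    (10016, "Error"),
]

# IDs you have already investigated and decided are safe to ignore (assumed).
known_ids = {1014}

unfamiliar = unfamiliar_events(todays_events, known_ids)
for event_id, count in unfamiliar.most_common():
    print(f"Event {event_id} appeared {count} time(s) and is not in the baseline")
```

The baseline set is the part that grows as you learn your system: each time you investigate a message and decide it is harmless, add its ID so that tomorrow only the genuinely new ones stand out.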

Backup Software and Jobs

If you are running a network of any importance, then you will be running nightly backups of your servers and your users’ sensitive data. This is a crucial process and one that cannot be left completely unmonitored. A backup can fail or get stuck for many reasons, and this may be unavoidable, but what you can avoid is the same error occurring for multiple days.

To do this, you need to check every morning that the backup job has completed successfully, and if it has failed, you need to investigate what caused it and what you need to do to stop it from occurring again. By looking into these errors, not only will you make sure that your backups are running correctly, but you will again be increasing your knowledge of whatever backup solution you use.

If you find yourself in a situation where a successful backup has not run for a few days and you are unable to fix the problem yourself, then make sure you contact the support line of whatever solution you are using and get them to help you correct the problem. Backups are crucial, and that should be no surprise to you, so take the time every morning to make sure these are done correctly as one day you just might need to use them.
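The two questions in this check (did last night’s job succeed, and how many days in a row has it failed?) can be sketched in a few lines of Python. This is only an illustration: it assumes you can get the time of the last successful backup and a simple pass/fail history out of your backup product, and the 26-hour threshold and function names are my own placeholders.

```python
from datetime import datetime, timedelta

def backup_is_stale(last_success, now, max_age=timedelta(hours=26)):
    """True when the most recent successful backup is older than max_age."""
    return now - last_success > max_age

def consecutive_failures(results):
    """Count failed runs at the end of the history (oldest first, True = success)."""
    count = 0
    for ok in reversed(results):
        if ok:
            break
        count += 1
    return count

# Hypothetical values for illustration only.
now = datetime(2024, 3, 14, 8, 0)
last_success = datetime(2024, 3, 11, 23, 30)
history = [True, True, False, False]  # the last two nightly runs failed

if backup_is_stale(last_success, now):
    print(f"No successful backup since {last_success:%Y-%m-%d %H:%M}")
if consecutive_failures(history) >= 2:
    print("Backup has failed more than one night running: time to call support")
```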

Services and Processes

One of the easiest ways to check that a system is operating correctly is to confirm that all the services that should be running on it are actually running. On a Windows system, you can access these by going to Start, Run, then typing in services.msc and clicking OK. I then find it easiest to sort by start-up type so that all the “Automatic” services are grouped together. Then you can quickly browse through and look for any services that are set to run automatically but have stopped or become disabled. This occasionally might happen out of the blue, but it can also be an indicator of another problem that may have triggered a service to stop.
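If you export the service list rather than eyeballing it, the same “Automatic but not running” filter is easy to express in Python. The sketch below works on plain records; how you collect them is up to you, and the Service type, the sample entries and the expected_stopped set are assumptions made for the example. Keep in mind that some Automatic services stop themselves when they have nothing to do, so treat the result as a list to review rather than a list of faults.

```python
from typing import NamedTuple

class Service(NamedTuple):
    name: str
    start_type: str  # e.g. "Automatic", "Manual", "Disabled"
    state: str       # e.g. "Running", "Stopped"

def stopped_automatic(services, expected_stopped=frozenset()):
    """Names of Automatic services that are not running and not known to idle."""
    return [
        s.name
        for s in services
        if s.start_type == "Automatic"
        and s.state != "Running"
        and s.name not in expected_stopped
    ]

# Hypothetical snapshot of a server's services, for illustration only.
snapshot = [
    Service("Spooler", "Automatic", "Stopped"),
    Service("Dnscache", "Automatic", "Running"),
    Service("sppsvc", "Automatic", "Stopped"),
    Service("Fax", "Manual", "Stopped"),
]

# Services you have learned are normally stopped on this machine (assumed).
to_review = stopped_automatic(snapshot, expected_stopped={"sppsvc"})
print("Services to review:", to_review)
```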

Internet and Email

Internet and email are among the most important things to check, because if they are not up and running you will certainly hear about it from your users as soon as they start using their computers. The actual process of checking these can fall under one of the other checks. For instance, if you are checking a remote server and the internet connection is down, you will find out soon enough because you will not be able to connect to the server.

Another idea might be to check the speed of your internet connection every morning. Running a speed test against a standard website (speedtest.net is an obvious choice) gives you a baseline of what to expect, and if something is running slower than normal, at least you will have the data to go back to your ISP and complain. Again, the main idea is to notice any of these issues before your users do, and to let them know you are already working on a fix before they even have a chance to complain.
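To make the “baseline” idea concrete, here is a small Python sketch that compares this morning’s speed reading against the median of earlier readings. It assumes you record the download speed yourself each day; the figures, the 70% tolerance and the function name are illustrative choices rather than recommendations.

```python
from statistics import median

def is_slower_than_usual(history_mbps, today_mbps, tolerance=0.7):
    """True when today's reading is below tolerance * median of earlier readings."""
    if not history_mbps:
        return False  # no baseline yet, so nothing to compare against
    return today_mbps < tolerance * median(history_mbps)

# Hypothetical download speeds (Mbps) noted on previous mornings.
history = [95.0, 100.0, 98.0, 97.0, 99.0]
today = 60.0

if is_slower_than_usual(history, today):
    print(f"Today: {today} Mbps, usual: about {median(history)} Mbps")
```

Using the median rather than the average means one unusually bad morning in the history does not drag the baseline down.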

Anti-Virus Definitions

Anti-Virus and any security software will always play a little game of “catch up”. A window of opportunity will always exist between the time a virus is released and the time virus definitions are updated to deal with it. Therefore, making sure all your systems are loaded with the latest Anti-Virus definitions is crucial for the software to work at its best. Depending on your vendor, the updates will be released at different times and schedules, but this is still something that requires a daily check.

You will need to check the vendor’s website for the latest version they have released and then confirm that this is the version installed on your workstations and servers. Hopefully, your chosen software solution includes some form of enterprise management to allow you to do this at a glance.

If you are going to pay for Anti-Virus software, then this is a crucial step to take to make sure you are getting your money’s worth.

Hard Drive Space

Whatever server operating system you are running, you will have problems if you run out of usable hard drive space. Checking this every day will give you an idea of how much free space is normally available on every server. Then, if you get to work one morning and this has changed considerably, you know to investigate further and find out what the cause is.

I have seen this happen many times; a user might decide to drop a tonne of photos on a network drive, or a piece of software might start to generate a log file over and over again. But whatever the reason, keeping a close eye on available HD space will allow you to react quickly if something like that does happen, rather than waiting for the phone to ring from a user to tell you about a problem.
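A daily free-space check is easy to script. The sketch below uses Python’s standard shutil.disk_usage to read the free space on a path and compares it with yesterday’s figure; the 10% threshold, the stored yesterday value and the function names are assumptions for illustration, and you would point it at your own drives or mount points.

```python
import shutil

def free_gib(path):
    """Free space on the volume containing path, in GiB."""
    return shutil.disk_usage(path).free / 2**30

def dropped_sharply(yesterday_gib, today_gib, max_drop_fraction=0.10):
    """True when free space fell by more than max_drop_fraction since yesterday."""
    if yesterday_gib <= 0:
        return False  # no usable baseline
    return (yesterday_gib - today_gib) / yesterday_gib > max_drop_fraction

# Hypothetical figure recorded yesterday; "." stands in for a real drive or share.
yesterday = 200.0
today = free_gib(".")

print(f"Free space today: {today:.1f} GiB")
if dropped_sharply(yesterday, today):
    print("Free space has dropped sharply since yesterday: investigate")
```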

CPU and Memory Usage

CPU and memory usage should take you only a few seconds to check, and you should check them for the same reasons I have listed above. Most of the time there will not be a problem, but you need to get an idea of what your server’s memory usage looks like at a normal level before you can even begin to notice what a problem looks like.

Any Third Party Software Error and Warning Logs

If you are running any business critical third party software on your server, the chances are that it too has some form of output or error logging capability. These logs should also be checked on a daily basis.

Hardware Error Logs

Logging on to your switches, firewalls and routers every morning and checking their error logs will provide you with the same benefits that I have listed above. It will embed into your mind what these hardware devices report on an average day and allow you to notice any changes. This will also help increase your knowledge of the actual devices themselves.

Printer Queues

Check all printers shared on your network to make sure the queue is empty, that the printer is up and running and that users can connect to it. A quick glance is usually all it takes to check this is OK. A stalled or errored job in the print queue can cause havoc with jobs stacking up behind it, so by checking this every morning you lower the chance of a problem occurring and make it unlikely that end users will be affected.

This might require a change in your mindset …

The above-listed tasks are what I recommend you check at a bare minimum, but this will, of course, vary with your situation and network. I hope the biggest point you take from this article is the mindset behind doing these checks every morning.

The most important factor is that by doing these checks you are remaining competent and efficient in your role as a network administrator. You are addressing issues before they turn into problems, taking care of situations, and giving yourself more knowledge of your networking environment.

You do not want to be waiting for your phone to ring for a user to tell you about a problem. You want to be the one giving them a call first, letting them know a problem exists and that you are already working on fixing it. This puts their mind at rest and will decrease the levels of stress in your workplace. You will be surprised how differently people react, and how much more understanding they are about problems and issues, if you are the one to tell them and not the other way around. Doing these checks gives you the opportunity every morning to be the person who discovers a problem and is working to fix it, rather than being the person who has to be told to fix it. If you are still unable to see the benefits of this, then put yourself in the shoes of a business owner and ask yourself who you would rather have employed.

You might at first see these additional tasks that need to be done every morning as an unnecessary burden, but in the long run, they will indeed pay off. They will make you more knowledgeable and competent in running your network, and they will make sure your system experiences fewer issues, which in turn will give you more time to work on value-add tasks that can help the business that uses the network.

Now that I have finished with my thoughts, I would love to hear back from any readers on their opinions of this list and the mindset behind it. Do you agree or disagree, and do you have any additional things you would recommend checking? Please leave these in the comments below. Also please feel free to share this article on Facebook and Twitter.

Warmest Regards,
Sonny