vSphere Troubleshooting Introduction
Before we begin, a short introduction to a few things will make life easier. We’ll start with a troubleshooting methodology and how to gather logs. After that, this eBook is broken into the following sections: Installation, Virtual Machines, Networking, Storage, vCenter/ESXi and Clustering.
ESXi and vSphere problems arise from many different places, but they generally fall into one of these categories:
- Hardware issues
- Resource contention
- Network attacks
- Software bugs
- Configuration problems
A typical troubleshooting process contains several tasks:
- Define the problem and gather information.
- Identify what is causing the problem.
- Implement a fix and verify that the problem is resolved.
One of the first things you should do when experiencing a problem with a host is try to reproduce the issue. If you can find a way to reproduce it, you have a great way to validate that the issue is resolved once you apply a fix. It also helps to benchmark your systems before they go into a production environment. If you know HOW they should be running, it’s easier to pinpoint a problem.
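As a rough illustration of why a pre-production benchmark helps, the sketch below compares a current sample against a stored baseline and reports only the metrics that have drifted well above it. The metric names, values, and thresholds are hypothetical placeholders, not real ESXi counter names; substitute whatever you actually collect.

```python
# Hypothetical baseline recorded before the host entered production.
# Metric names and numbers are illustrative, not real ESXi counter names.
BASELINE = {"cpu_ready_pct": 2.0, "mem_balloon_mb": 0.0, "disk_latency_ms": 8.0}


def find_deviations(baseline, current, tolerance=0.25, min_delta=1.0):
    """Return {metric: (expected, observed)} for metrics well above baseline.

    A metric is flagged when it exceeds its baseline by more than
    max(baseline * tolerance, min_delta). The min_delta floor keeps
    zero or near-zero baselines from flagging on small noise.
    """
    deviations = {}
    for name, expected in baseline.items():
        observed = current.get(name)
        if observed is None:
            continue  # metric was not collected in this sample
        allowed = expected + max(expected * tolerance, min_delta)
        if observed > allowed:
            deviations[name] = (expected, observed)
    return deviations


if __name__ == "__main__":
    sample = {"cpu_ready_pct": 9.5, "mem_balloon_mb": 0.0, "disk_latency_ms": 9.0}
    for metric, (expected, observed) in find_deviations(BASELINE, sample).items():
        print(f"{metric}: baseline {expected}, now {observed}")
```

Running the same comparison after a fix gives a simple way to confirm that the flagged metrics have returned to their baseline range.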
You should decide whether a “Top Down” approach (start at the guest OS and work down toward the host) or a “Bottom Up” approach (start at the host and work up) is the better way to determine the root cause. Guest OS-level issues account for a large number of problems. Let’s face it, some of the applications we use are not perfect. They get the job done, but they use a lot of memory doing it.
At the virtual machine level, is it possible that a limit or share value is misconfigured?
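A minimal sketch of that check, assuming you have already exported each virtual machine's allocation settings into a simple record. The `VmAllocation` class and its field names are hypothetical; the one convention borrowed from vSphere is that a limit of -1 means "unlimited".

```python
from dataclasses import dataclass

UNLIMITED = -1  # vSphere reports -1 when no limit is set


@dataclass
class VmAllocation:
    """Hypothetical record of one VM's resource settings."""
    name: str
    configured_mem_mb: int
    mem_limit_mb: int = UNLIMITED
    cpu_limit_mhz: int = UNLIMITED
    shares_level: str = "normal"  # low, normal, high, or custom


def audit(vm):
    """Return a list of human-readable findings worth a second look."""
    findings = []
    if vm.mem_limit_mb != UNLIMITED and vm.mem_limit_mb < vm.configured_mem_mb:
        findings.append(
            f"{vm.name}: memory limit {vm.mem_limit_mb} MB is below "
            f"configured memory {vm.configured_mem_mb} MB"
        )
    if vm.cpu_limit_mhz != UNLIMITED:
        findings.append(f"{vm.name}: CPU limit set to {vm.cpu_limit_mhz} MHz")
    if vm.shares_level != "normal":
        findings.append(f"{vm.name}: shares level is '{vm.shares_level}'")
    return findings


if __name__ == "__main__":
    for vm in (VmAllocation("web01", 4096),
               VmAllocation("db01", 8192, mem_limit_mb=4096)):
        for finding in audit(vm):
            print(finding)
```

A finding is not automatically a misconfiguration; a limit or non-default share level may be intentional, so treat the output as a list of settings to confirm with whoever owns the VM.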
At the ESXi host level, you may simply need additional resources. It’s hard to believe sometimes, but you might need another host to help carry the load!
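The choice between a top-down and a bottom-up pass over these three levels can be sketched as a small loop. The layer names and the check callables are hypothetical stand-ins for whatever health checks you run at each level; each check is assumed to return True when that level looks healthy.

```python
# Layers listed from top (closest to the application) to bottom.
LAYERS = ["guest_os", "virtual_machine", "esxi_host"]


def first_failing_layer(checks, top_down=True):
    """Walk the layers in the chosen direction and return the first one
    whose check reports a problem, or None if every check passes.

    `checks` maps a layer name to a zero-argument callable that returns
    True when the layer is healthy. Layers without a check are skipped.
    """
    order = LAYERS if top_down else list(reversed(LAYERS))
    for layer in order:
        check = checks.get(layer)
        if check is not None and not check():
            return layer
    return None


if __name__ == "__main__":
    # Illustrative results only: the VM and host checks both fail here.
    checks = {
        "guest_os": lambda: True,
        "virtual_machine": lambda: False,
        "esxi_host": lambda: False,
    }
    print(first_failing_layer(checks))                  # starts at the guest OS
    print(first_failing_layer(checks, top_down=False))  # starts at the host
```

The two directions can stop at different layers when more than one level has a problem, which is why it is worth deciding up front where the symptoms suggest you should start.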
Once you have identified the root cause, assess the impact of the problem on your day-to-day operations. When should you implement a fix, and what type: a short-term workaround or a long-term solution? Assess the impact of the fix itself on daily operations as well.
- Short-term solution: Implement a quick workaround.
- Long-term solution: Reconfiguration of a virtual machine or host.
Next in this series: vSphere Troubleshooting Series: Part 2 – vSphere Troubleshooting Tools