VMWareTrainingNotes2008
From GarrettHoneycutt
Contact info
alec taylor alec@ivoxy.com 206.755.5552 training.ivoxy.com
Tuesday
vmotion + drs
- bump nfs mount count to 32 (max)
- dedicated network interface for vmotion traffic - do not use the nfs (storage) network - best practice
DRS - every 5 minutes take a sample and figure out where to place load
HA (not really)
- can lose an esx server or network connectivity (just pings gateway on a SINGLE interface); after a predetermined amount of time, it restarts those virtual machines on a different host
- this is a crash state, not a vmotion xfer - just like pulling the power on the host and turning it back on somewhere else
- after 2.5 no more split brain, it just lets the vm's run with no network
- being supplanted by fault tolerance manager in VI4 for single vcpu vm's - like a vmotion that never ends (eats up capacity)
VMFS
- bad locking
- nfs is faster
- storage io queueing is inefficient
Virtual Centre
- Needs 32bit system
- recommended as win 2003 server
- can vmotion itself
- can be run as a virtual machine
Virtual Infrastructure Client
- windows .net binary
- linux/mac coming soon
virtual smp
- if the physical box has 4 cpu's i could provision a 1x, 2x or 4x vcpu vm
- dont provision a vm with more vcpu's than physical cpu's
- if you have 2 dual core cpu's you have 4 cores, but you shouldn't have a vm with more than 2 vcpu's, because data transfer between the physical cpu's on that bus would be horrendous
update manager
- used to patch esx hosts
- inventory linux (limited) but no patching
- windows patch mgmt works
vmware consolidated backup (hot mess)
- frankenstein product to allow windows node to mount vmfs volumes and proxy off to some other backup target - ie: copy entire vm's over the network
- DO NOT USE THIS
- does not scale
- could just do zfs snapshots of the NFS mount, or some other snapshot software (netapp)
consolidation mgr
- gives you some capacity planning information, so when you move phys boxes to vm's you get an idea of what you need on your esx host
selecting esx hardware
questions to ask
- how many
- cpu type
- memory qty
- fault tolerant capacity
- io ports/boards
recommendations
network
- intel nic's
- avoid broadcom
- esx supports toe/tso (tcp offload engine / tcp segmentation offload)
disks
- raid in hardware
- esx does not support software mirrors
fiberchannel
- qlogic or lsi
iscsi
- qlogic 4000 series
Sizing
- 4:1 vcpu:phys cpu is a good place to start
- 2:1 virtual mem:phys mem
max
- 32 cpu cores / .5 TB RAM - largest available - but not a good idea
- vmware is only going to vmotion 2 vm's at a time
- contention in system bus for a node that is too large
- using 4 or 5 DL380's would be a better idea
# of esx hosts
- minimum 3
- take 10% off the top of RAM
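A rough sketch of the sizing arithmetic above, assuming the 4:1 and 2:1 ratios are applied per host and that "take 10% off the top of RAM" means holding back 10% of physical RAM before applying the memory ratio (that reading is an assumption). The function names and the example host (8 cores, 32 GB) are made up for illustration.

```python
import math

def usable_capacity(phys_cores, phys_ram_gb,
                    vcpu_ratio=4.0, vmem_ratio=2.0, ram_holdback=0.10):
    """Return (max_vcpus, max_vm_ram_gb) for one host under the starting ratios."""
    # Hold back a slice of physical RAM first (assumed meaning of "10% off the top").
    usable_ram_gb = phys_ram_gb * (1.0 - ram_holdback)
    return int(phys_cores * vcpu_ratio), usable_ram_gb * vmem_ratio

def hosts_needed(total_vcpus, total_vm_ram_gb, phys_cores, phys_ram_gb, min_hosts=3):
    """Hosts required to fit the workload, never fewer than the 3-host minimum."""
    max_vcpus, max_ram_gb = usable_capacity(phys_cores, phys_ram_gb)
    by_cpu = math.ceil(total_vcpus / max_vcpus)
    by_ram = math.ceil(total_vm_ram_gb / max_ram_gb)
    return max(min_hosts, by_cpu, by_ram)

if __name__ == "__main__":
    # Hypothetical workload: 100 vcpus and 200 GB of configured VM memory.
    print(usable_capacity(8, 32))
    print(hosts_needed(100, 200, phys_cores=8, phys_ram_gb=32))
```

This does not leave headroom for losing a host; add one or more hosts on top if the cluster has to survive a failure.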
io board minimum
- 4x gigE
- 2x 10gigE
- 4x FC
networking esx hosts
channelization between hosts - uses etherchannel in static mode
- 4Gb channel between hosts if you use the 4x gigE
- esx does NOT support LACP (802.3ad)
- src:dst mac load balancing (per session, not per packet) is used to determine which link in the channel to use - return traffic is asymmetric (can be sent using any link), though it also uses ==src:dst mac==, so it will get set to use one of the links and be bound to it
- src:dst ip is supported and is what you want if you are doing any routing on any end of the switch, else the mac is translated to that of the router and all traffic through it will get stuck on one link
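To see why src:dst mac hashing gets stuck on one link behind a router, here is a toy model. The hash (XOR the two addresses, modulo the link count) is a simplification for illustration only, not the actual algorithm used by ESX or any particular switch, and all addresses are invented.

```python
import ipaddress

def mac_to_int(mac: str) -> int:
    """Convert aa:bb:cc:dd:ee:ff into an integer."""
    return int(mac.replace(":", ""), 16)

def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 address into an integer."""
    return int(ipaddress.ip_address(ip))

def pick_link(src: int, dst: int, n_links: int) -> int:
    """Toy hash: XOR source and destination, then modulo the number of links."""
    return (src ^ dst) % n_links

N_LINKS = 4                       # e.g. the 4x gigE channel
HOST_MAC = "00:50:56:00:00:01"    # made-up address of the sender
ROUTER_MAC = "00:1b:54:00:00:fe"  # made-up address of the next-hop router
HOST_IP = "10.0.0.10"
REMOTE_IPS = ["192.168.1.20", "192.168.1.21", "192.168.1.22", "192.168.1.23"]

# Routed traffic: the destination MAC is always the router, whatever the remote IP is.
mac_links = [pick_link(mac_to_int(HOST_MAC), mac_to_int(ROUTER_MAC), N_LINKS)
             for _ in REMOTE_IPS]

# IP hashing still sees a different destination for every remote peer.
ip_links = [pick_link(ip_to_int(HOST_IP), ip_to_int(ip), N_LINKS)
            for ip in REMOTE_IPS]

if __name__ == "__main__":
    print("src:dst mac ->", mac_links)
    print("src:dst ip  ->", ip_links)
```

With the mac hash every routed flow lands on the same link; with the ip hash the four flows spread across the channel.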
esx networking
- does not do routing
- it does do 802.1q tagging
- layer 3 traffic always leaves esx host
- get intel 10/100/1000 VT card (specifically VT) - uses e1000 driver
- dont use onboard NIC's
- dont use different NIC's
- onboard will share interrupts with USB or other devices - no good!
- disable USB if available
- use onboards for NFS, etc..
- use an allowed vlan list if we use 802.1q so that you could only hop to related vlans and not to every vlan
- manually set speed 1000 and full duplex
pNIC
phys nic in the esx host
vmnic
name for each phys NIC in the esx host vmnic0 - vmnic31
vswif
name given to the service console's virtual eth interfaces; seen in the service console in the output of 'ifconfig -a'
vswitch
virtual switch
port group
- similar to vlan; a port group can also consist of IP config info, in the case of service console and vmkernel port groups. a port group is associated with one vSwitch
beacon probing
- only used on vswitches with more than 2 associated pNIC's
- more authoritative way than just monitoring link status. sends out beacons on the NIC's and if a minority of peers does not see a link, it will disable it
- put iscsi on dedicated vlan
- put nfs on dedicated vlan
- build 1 vswitch if possible and use link aggregation
- dont use standby nic's .. use link aggregation instead
- can enable port fast/turn off STP (spanning tree protocol) on switch ports connected to a vswitch, as it will not cause an STP loop
Storage
- RDM - raw device mapping - allows vm to talk directly to the HBA for performance
VMFS
- single IO queue
Locking
- SCSI reservations - whole LUN gets locked for every 16MB (read and write)
- this sucks
- tailing a file creates locks
- writing to a log file creates locks
- this really sucks
- view it as a glorified tape drive
- metadata locks LUN, too (file size changes -> snapshots)
- more blocks on a volume, the higher chance of reservations and locking
- you want a ton of small vmfs LUNS instead of one large one
- hitachi told someone in portland not to create a LUN larger than 108GB
- expect problems if you run more than 12 VM's on it
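A quick way to turn the "many small LUNs" advice into a layout. This is only a sketch: it uses first-fit packing, takes the 108GB and 12-VM figures from the notes above as defaults, and ignores swap files, snapshots and growth. The function name and inputs are hypothetical.

```python
def plan_luns(vm_disk_gb, max_lun_gb=108, max_vms_per_lun=12):
    """Group VM disk sizes (GB) into LUNs, largest first, first LUN that fits wins."""
    luns = []  # each LUN is a list of the VM disk sizes placed on it
    for size in sorted(vm_disk_gb, reverse=True):
        if size > max_lun_gb:
            raise ValueError(f"a {size}GB VM does not fit in a {max_lun_gb}GB LUN")
        for lun in luns:
            # Respect both the capacity cap and the VM-count cap.
            if len(lun) < max_vms_per_lun and sum(lun) + size <= max_lun_gb:
                lun.append(size)
                break
        else:
            # No existing LUN had room, so start a new one.
            luns.append([size])
    return luns

if __name__ == "__main__":
    for i, lun in enumerate(plan_luns([20] * 10)):
        print(f"LUN {i}: {len(lun)} VMs, {sum(lun)}GB")
```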
Wednesday
VC (virtual centre)
Database needed
SQL Express
- < 4GB
- no support given from microsoft
Oracle
- Oracle 10g or newer
- recommended if you think you will go over 4GB, which is likely with extended performance stats, which we want
Virtual or Physical?
- virtual with only a few exceptions
- give it a vcpu and 1GB of RAM for every 12 hosts
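The VC sizing rule of thumb as arithmetic, assuming it rounds up (13 hosts counts as two blocks of 12) and that the VM never gets less than 1 vcpu / 1GB. The function name is made up.

```python
import math

def vc_vm_size(num_hosts):
    """Return (vcpus, ram_gb) for the VirtualCenter VM: one of each per 12 hosts."""
    blocks = max(1, math.ceil(num_hosts / 12))
    return blocks, blocks

if __name__ == "__main__":
    for hosts in (3, 12, 13, 30):
        print(hosts, "hosts ->", vc_vm_size(hosts))
```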
Get Checklist
- available on training site - http://training.ivoxy.com
Order of Installation
- copy license file somewhere local
- install license server (uses tcp/27000)
- viclient
- vcserver (.exe, not the .msi)
- Don't turn on Tomcat stuff - this was pre API and was a secure web based API via XML
Tweaks
Administration -> VirtualCenter Management Server Configuration
Statistics
- change statistics level to 3 (maybe not on the daily per year though)
- would probably not use level 4 unless directed by VMWare Support
- if the estimated database size is greater than 4GB, you better not be using the default SQL Express, which only supports up to 4GB, or it will blow up
- almost useless
- do NOT rely upon
Timeout Settings
- Normal Operations - bump up to 60 seconds
- Long Operations - bump down to 60 minutes
Logging Options
- should not need to change from ==Info== unless directed by VMWare Support
Database
- If you have multiple people using VC, you might want to bump up the maximum # of database connections
Hosts & Clusters
Alarms
- Disable ALL alarms, EXCEPT Host connection state
Permissions
- Folders are useful places to put VM's
- assign perms to the folder
Resource Pools
- Always use them!
- Contention for CPU cycles is the problem
- avg -20% - reservation
- avg +20% - limit
- watch the queue, not the workload
- dual or quad vcpu vm's have to be scheduled at the same time, so the hypervisor has to schedule dual or quad physical cores at that time
- use single vcpu machines whenever possible!
- this is where we dial-in oversubscription values
- know max/min cpu & ram requirements (formula)
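The avg -20% / avg +20% rule for resource pools, sketched as a small helper. It assumes "avg" is the measured average CPU usage of the pool's workload in MHz; the function name and the 1000 MHz example are illustrative only.

```python
def pool_cpu_settings(avg_mhz, margin=0.20):
    """Derive a reservation and limit (MHz) from average usage, +/- the margin."""
    return {
        "reservation": round(avg_mhz * (1 - margin)),  # avg - 20%
        "limit": round(avg_mhz * (1 + margin)),        # avg + 20%
    }

if __name__ == "__main__":
    print(pool_cpu_settings(1000))
```

The same helper could be applied to memory, but the notes only state the rule in the context of CPU contention.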