VMWareTrainingNotes2008

From GarrettHoneycutt

Contact info

alec taylor
alec@ivoxy.com
206.755.5552
training.ivoxy.com

Tuesday

vmotion + drs

  • bump nfs mount count to 32 (max)
  • best practice: use a dedicated network interface for vmotion traffic - do not share the nfs (storage) network


DRS - every 5 minutes take a sample and figure out where to place load

HA (not really)

  • can lose an esx server or network connectivity (it just pings the gateway on a SINGLE interface); after a predetermined amount of time it restarts those virtual machines on a different host. this is a crash state, not a vmotion xfer - just like pulling the power on the host and turning it back on somewhere else
  • after 2.5 no more split brain, it just lets the vm's run with no network
  • being supplanted by fault tolerance manager in VI4 for single vcpu vm's - like a vmotion that never ends (eats up capacity)

VMFS

  • bad locking
  • nfs is faster
  • storage io queueing is inefficient

Virtual Centre

  • Needs 32bit system
  • recommended as win 2003 server
  • can vmotion itself
  • can be run as a virtual machine

Virtual Infrastructure Client

  • windows .net binary
  • linux/mac coming soon

virtual smp

  • if the physical box has 4 cpu's i could provision a 1x, 2x or 4x vcpu vm
  • don't provision a vm with more vcpu's than physical cpu's
    • if you have 2 dual core cpu's you have 4 cores, but you shouldn't have a vm with more than 2 vcpu's, because data transfer between the physical cpu's on that bus would be horrendous
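a rough sketch of the vcpu rule above, assuming the rule of thumb is "never more vcpu's than physical cores, and preferably no more than the cores on one physical cpu". the function name and the return strings are made up for illustration, nothing here is a vmware tool or api.

```python
def check_vcpu_count(vcpus: int, sockets: int, cores_per_socket: int) -> str:
    """Classify a vm's vcpu count against the host's physical cpu layout."""
    if min(vcpus, sockets, cores_per_socket) < 1:
        raise ValueError("all counts must be >= 1")
    if vcpus > sockets * cores_per_socket:
        # more vcpu's than the box has physical cores
        return "too many"
    if vcpus > cores_per_socket:
        # the vm would have to span physical cpu's (cross-socket traffic)
        return "spans sockets"
    return "ok"


# 2 dual core cpu's = 4 cores, but a 4 vcpu vm spans both sockets
print(check_vcpu_count(2, sockets=2, cores_per_socket=2))
print(check_vcpu_count(4, sockets=2, cores_per_socket=2))
```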

update manager

  • used to patch esx hosts
  • can inventory linux (limited) but cannot patch it
  • windows patch mgmt works


vmware consolidated backup (hot mess)

  • frankenstein product to allow windows node to mount vmfs volumes and proxy off to some other backup target - ie: copy entire vm's over the network
  • DO NOT USE THIS
  • does not scale
  • could just do zfs snapshots of the NFS mount, or some other snapshot software (netapp)

consolidation mgr

  • gives you some capacity planning information, so when you move phys boxes to vm's you get an idea of what you need on your esx host

selecting esx hardware

questions to ask

  • how many
  • cpu type
  • memory qty
  • fault tolerant capacity
  • io ports/boards

recommendations

network
  • intel nic's
  • avoid broadcom
  • esx supports toe/tos (tcp offload engine)
disks
  • raid in hardware
  • esx does not support software mirrors
fiberchannel
  • qlogic or lsi
iscsi
  • qlogic 4000 series
Sizing

  • 4:1 vcpu:phys cpu is a good place to start
  • 2:1 virtual mem:phys mem

max
  • 32 cpu cores / .5 TB RAM - largest available - but not a good idea
  • vmware is only going to vmotion 2 vm's at a time
  • contention in system bus for a node that is too large
  • using 4 or 5 DL380's would be a better idea


# of esx hosts
  • minimum 3
  • take 10% off the top of RAM
io board minimum
  • 4x gigE
  • 2x 10gigE
  • 4x FC
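the sizing numbers above can be turned into a quick back-of-the-envelope calculation. this is only a sketch: it assumes the 10% is taken off each host's RAM before the 2:1 ratio is applied, and it ignores spare capacity for host failures. the function and parameter names are hypothetical.

```python
import math


def esx_hosts_needed(total_vcpus, total_vm_ram_gb, cores_per_host, ram_gb_per_host,
                     vcpu_ratio=4, vmem_ratio=2, ram_overhead=0.10, min_hosts=3):
    """Estimate the host count from the 4:1 vcpu and 2:1 memory starting ratios."""
    # take 10% off the top of RAM (assumed to apply per host)
    usable_ram_gb = ram_gb_per_host * (1 - ram_overhead)
    by_cpu = math.ceil(total_vcpus / (cores_per_host * vcpu_ratio))
    by_ram = math.ceil(total_vm_ram_gb / (usable_ram_gb * vmem_ratio))
    # never go below the 3 host minimum
    return max(by_cpu, by_ram, min_hosts)


# 100 vcpu's and 200GB of vm memory on hosts with 8 cores and 32GB each
print(esx_hosts_needed(100, 200, cores_per_host=8, ram_gb_per_host=32))
```

whichever of cpu or RAM demands more hosts wins, which is usually what you want to see before deciding on a few mid-sized boxes versus one huge one.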

networking esx hosts

channelization between hosts - uses etherchannel in static mode

  • 4Gb channel between hosts if you use the 4x gigE
  • esx does NOT support LACP (802.3ad)
  • src:dst mac load balancing (per session, not per packet) is used to determine which link in the channel to use - return traffic is asymmetric (it can come back on any link), though since it also uses src:dst mac it will get pinned to one of the links and stay bound to it
  • src:dst ip is supported and is what you want if you are doing any routing on either end of the switch, otherwise the mac is rewritten to that of the router and all traffic through it will get stuck on one link
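to see why routed traffic gets stuck on one link with src:dst mac, here is a toy hash-based link picker. it is an illustration only: the xor-and-modulo hash is an assumption, not the actual algorithm used by esx or by any particular switch, and the addresses are invented.

```python
import ipaddress


def pick_link_by_mac(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Pick a link from the src:dst mac pair (same pair always gets the same link)."""
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return (src ^ dst) % n_links


def pick_link_by_ip(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Pick a link from the src:dst ip pair instead."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % n_links


vm_mac, router_mac = "00:50:56:00:00:01", "00:00:0c:00:00:fe"
remote_ips = ["192.168.1.10", "192.168.1.11", "192.168.1.12", "192.168.1.13"]

# every routed destination is reached via the router's mac, so the mac hash never changes
mac_links = {pick_link_by_mac(vm_mac, router_mac, 4) for _ in remote_ips}
# the ip hash still sees four different destinations
ip_links = {pick_link_by_ip("10.0.0.5", ip, 4) for ip in remote_ips}
print(len(mac_links), len(ip_links))
```

with the mac hash all four flows land on a single link, while the ip hash spreads these four flows over all four links.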

esx networking

  • does not do routing
  • it does do 802.1q tagging
  • layer 3 traffic always leaves esx host


  • get intel 10/100/1000 VT card (specifically VT) - uses e1000 driver
    • don't use onboard NIC's
    • don't mix different NIC's
    • onboard will share interrupts with USB or other devices - no good!
    • disable USB if available
    • use onboards for NFS, etc..
  • use an allowed vlan list if we use 802.1q so that you could only hop to related vlans and not to every vlan
  • manually set speed 1000 and full duplex

pNIC

phys nic in the esx host

vmnic

name for each phys NIC in the esx host: vmnic0 - vmnic31

vswif

name given to the service console's virtual eth interfaces; seen in the service console in the output of 'ifconfig -a'

vswitch

virtual switch

port group

  • similar to a vlan. a port group can also carry IP config info, in the case of service console and vmkernel port groups. a port group is associated with one vSwitch

beacon probing

  • only used on vswitches with more than 2 associated pNIC's
  • more authoritative than just monitoring link status. sends out beacons on the NIC's, and if a minority of peers does not see a link, it will disable it
  • put iscsi on dedicated vlan
  • put nfs on dedicated vlan
  • build 1 vswitch if possible and use link aggregation
  • don't use standby nic's .. use link aggregation instead
  • can enable port fast/turn off STP (spanning tree protocol) on switch ports connected to a vswitch, as it will not cause an STP loop


Storage

  • RDM - raw device mapping - allows vm to talk directly to the HBA for performance

VMFS

  • single IO queue

Locking

  • SCSI reservations - whole LUN gets locked for every 16MB (read and write)
    • this sucks
  • tailing a file creates locks
  • writing to a log file creates locks
    • this really sucks
  • view it as a glorified tape drive
  • metadata locks LUN, too (file size changes -> snapshots)


  • more blocks on a volume, the higher chance of reservations and locking
    • you want a ton of small vmfs LUNS instead of one large one
    • hitachi told someone in portland not to create a LUN larger than 108GB
  • expect problems if you run more than 12 VM's on it

Wednesday

VC (virtual centre)

Database needed

SQL Express

  • < 4GB
  • no support given from microsoft

Oracle

  • Oracle 10g or newer
  • recommended if you think you will go over 4GB, which is likely with extended performance stats, which we want

Virtual or Physical?

  • virtual with only a few exceptions
  • give it a vcpu and 1GB of RAM for every 12 hosts
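the VC sizing rule above as a one-liner, assuming a partial group of 12 hosts rounds up to a full vcpu and a full GB. the function name is hypothetical.

```python
import math


def vc_vm_size(esx_hosts: int, hosts_per_unit: int = 12) -> dict:
    """One vcpu and 1GB of RAM for every 12 hosts, rounded up."""
    if esx_hosts < 1:
        raise ValueError("need at least one esx host")
    units = math.ceil(esx_hosts / hosts_per_unit)
    return {"vcpus": units, "ram_gb": units}


print(vc_vm_size(30))
```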

Get Checklist


Order of Installation

  1. copy license file somewhere local
  2. install license server (uses tcp/27000)
  3. viclient
  4. vcserver (.exe, not the .msi)
    • Don't turn on Tomcat stuff - this was pre API and was a secure web based API via XML

Tweaks

Administration -> VirtualCenter Management Server Configuration

Statistics
  • change statistics level to 3 (though maybe not on the daily samples kept per year)
  • would probably not use level 4 unless directed by VMWare Support
  • if the estimated database size is greater than 4GB, you had better not be using the default SQL Express, which only supports up to 4GB, or it will blow up
Mail
  • almost useless
  • do NOT rely upon
Timeout Settings
  • Normal Operations - bump up to 60 seconds
  • Long Operations - bump down to 60 minutes
Logging Options
  • should not need to change from Info unless directed by VMWare Support
Database
  • If you have multiple people using VC, you might want to bump up the maximum # of database connections

Hosts & Clusters

Alarms
  • Disable ALL alarms, EXCEPT Host connection state
Permissions
  • Folders are useful places to put VM's
  • assign perms to the folder
Resource Pools
  • Always use them!
  • Contention for CPU cycles is the problem
    • reservation = avg - 20%
    • limit = avg + 20%
    • watch the queue, not the workload
    • dual or quad vcpu vm's have to be scheduled at the same time, so the hypervisor has to schedule dual or quad physical cores at that time
    • use single vcpu machines whenever possible!
  • this is where we dial-in oversubscription values
  • know max/min cpu & ram requirements (formula)
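a small sketch of the reservation/limit rule from the resource pool notes. it assumes "avg" is the average cpu usage you have measured for the pool (in MHz) and that the 20% is relative to that average. the function name and dict keys are hypothetical, not VC settings names.

```python
def pool_cpu_settings(avg_mhz: float, margin_pct: int = 20) -> dict:
    """Reservation at avg - 20%, limit at avg + 20%."""
    if avg_mhz <= 0:
        raise ValueError("average usage must be positive")
    reservation = avg_mhz * (100 - margin_pct) / 100
    limit = avg_mhz * (100 + margin_pct) / 100
    return {"reservation_mhz": reservation, "limit_mhz": limit}


# a pool that averages 2000 MHz
print(pool_cpu_settings(2000))
```

the same shape works for RAM once you know the pool's average memory use.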