Showing posts with label SSD. Show all posts

Tuesday, January 24, 2017

Learning about vSphere Flash Read Cache

I'm looking at vSphere Flash Read Cache (vFRC) in case Pernix FVP does not get an update for vSphere 6.5 (after PernixData was bought by Nutanix). vFRC is a bit different right off the bat, since it doesn't do write acceleration, but since I already have the required vSphere licensing and hardware, there is no cost to enable it.

The biggest problems I see so far are:

1) Not a lot of reported users, at least that I could find. VMware has kept the feature since it was announced, so there must be quite a few, but I found mostly feature-announcement posts rather than operational blogs.
2) More rigid implementation steps compared to Pernix FVP, which take some reading to figure out.

Biggest differences with Pernix FVP apart from the obvious:



Known KBs
There are two known issues, and both are easily avoided because patches have already been released, so just make sure you are running the latest patch level before enabling vFRC.

https://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2114498&sliceId=1&docTypeID=DT_KB_1_1&dialogID=381661788&stateId=0%200%20381669816
https://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2072392&sliceId=1&docTypeID=DT_KB_1_1&dialogID=381647699&stateId=0%200%20381651933 

Documentation
http://pubs.vmware.com/vsphere-65/index.jsp#com.vmware.vsphere.storage.doc/GUID-07ADB946-2337-4642-B660-34212F237E71.html 
http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/vmware-vsphere-flash-read-cache-faq.pdf << particularly useful
http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/vmware-vfrc-performance-vsphere55-white-paper.pdf  << cool read

Blog posts

http://cormachogan.com/2014/02/14/a-closer-look-at-vsphere-flash-read-cache-vfrc/ 
http://nolabnoparty.com/en/vmware-vflash-read-cache-setup/ 
http://everything-virtual.com/vmware-study-guides/vcap-dca-study-guide/configure-and-manage-vsphere-flash-read-cache/ 
http://www.settlersoman.com/what-is-and-how-to-configure-vmware-vsphere-flash-read-cache-vfrc/ 
http://www.vladan.fr/vmware-vflash-read-cache-vfrc/ 

Monday, April 25, 2016

Working with the Dell sold Samsung NVMe XS1715 SSD 800GB

TL;DR - you will probably need to do a firmware downgrade and a driver update to comply with the HCL. Don't trust the Dell installer or the OME Linux updater!

We bought some Dell servers that have this NVMe drive. As far as I know, this is the only hot-plug NVMe drive they sell for the R730XD series, but I suspect this will change sometime in the future.

The installer originally used on this server didn't have the NVMe driver, so I wasn't able to see the drive in my storage interface. To make sure the drive is working properly before installing the driver, you can check that it shows up in the storage section of the iDRAC interface, and you can use this command to see whether the PCIe device shows up in ESXi:

~ # lspci -v |grep NVMe
0000:87:00.0 Non-Volatile memory controller Mass storage controller: Samsung, Inc. Express Flash NVMe XS1715 SSD 800GB

With this information, even without having the device mapped to a vmhba yet, I can get the four hardware IDs (vendor, device, subvendor and subdevice) and check this device in the HCL:

~ # vmkchdev -l |grep 0000:87:00.0
0000:87:00.0 144d:a820 1028:1f96 vmkernel
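
As a minimal sketch, the line above can be split into the four IDs that the HCL search form asks for. This assumes the usual `vmkchdev -l` layout of `address vendor:device subvendor:subdevice owner`; the `PciIds` type and `parse_vmkchdev_line` function are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class PciIds(NamedTuple):
    """The PCI address plus the four hardware IDs used for an HCL lookup."""
    address: str
    vendor_id: str
    device_id: str
    subvendor_id: str
    subdevice_id: str


def parse_vmkchdev_line(line: str) -> PciIds:
    """Split one line of `vmkchdev -l` output into its ID fields.

    Assumed layout: <address> <vid>:<did> <svid>:<ssid> <owner>
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("unexpected vmkchdev line: %r" % line)
    address, vid_did, svid_ssid = fields[:3]
    vendor_id, device_id = vid_did.split(":")
    subvendor_id, subdevice_id = svid_ssid.split(":")
    return PciIds(address, vendor_id, device_id, subvendor_id, subdevice_id)


# The line captured from the host in this post.
ids = parse_vmkchdev_line("0000:87:00.0 144d:a820 1028:1f96 vmkernel")
print(ids)
```

Here 144d is the Samsung vendor ID and 1028 is the Dell subvendor ID, which matches what lspci reported for this drive.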


In my case, this is being installed with 5.5u3.


Firmware:

The HCL lists the same firmware version for all ESXi releases (IMP0DD3Q), and a quick Google search shows this is the initial firmware release from 2014. This server has already had an OME boot CD run, and I didn't see any updates for this device; but checking in iDRAC, it shows I am running a different release, IPM0ID3Q.


Checking the Dell page for that firmware release, it seems this is the 3rd release, and the VMware HCL has not caught up to these releases. The release I'm currently running is from June 2015 and improves the wear levelling algorithm, while the 2nd release improves management functions. Tsk tsk, VMware/Dell...

For now it seems I will have to figure out how to downgrade the firmware to comply with the HCL. Once I figure it out, I will update this section. I checked the VSAN HCL as well, and there are currently no Samsung NVMe drives certified, but other vExperts told me certification is in progress; I will post an update once I have news on that front.


Driver:

After installing with the latest Dell image available to date (VMware-VMvisor-Installer-5.5.0.update03-3568722.x86_64-Dell_Customized-A05.iso), I was able to see the driver in use with "esxcli storage core adapter list":

vmhba7 nvme link-n/a pscsi.vmhba7 (0:135:0.0) Samsung Electronics Co Ltd Dell Express Flash NVMe XS1715 800GB PCIe SSD Controller

And we can check this driver's information with "vmkload_mod -s nvme":

~ # vmkload_mod -s nvme
vmkload_mod module information
 input file: /usr/lib/vmware/vmkmod/nvme
 Version: 1.2.0.27-4vmw.550.0.0.1331820
 License: BSD
 Required name-spaces:
  com.vmware.vmkapi#v2_2_0_0
 Parameters:
  nvme_compl_worlds_num: int
    Total number of NVMe completion worlds/queues.
  nvme_dbg: int
    Driver NVME_DEBUG print level
  io_timeout: int
    IO timeout second for internal checker
  max_scsi_unmap_requests: int
    Maximum number of scsi unmap requests supported
  max_namespaces: int
    Maximum number of namespaces supported
  io_cpl_queue_size: int
    NVMe number of IO completion queue entries
  io_sub_queue_size: int
    NVMe number of IO submission queue entries
  admin_cpl_queue_size: int
    NVMe number of Admin completion queue entries
  admin_sub_queue_size: int
    NVMe number of Admin submission queue entries.
  nvme_log_level: int
    Log level.
        1 - error
        2 - warning
        3 - info (default)
        4 - verbose
        5 - debug

As seen in the HCL at the beginning of this post, the only driver version listed is 1.0e.0.30-1vmw, so I will also need to download and install a custom driver for this card. At least for vSphere 6.0 and later, the HCL lists the VMware inbox driver instead.
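
The comparison above can be scripted as a rough sketch: pull the "Version:" line out of saved `vmkload_mod -s nvme` output and check it against the HCL string. The function name `module_version` and the prefix comparison are my assumptions here (I am assuming the HCL string is a prefix of the full module version); the sample text is trimmed from the output shown earlier.

```python
import re

# Driver version listed on the HCL for this device, as quoted in this post.
HCL_DRIVER_VERSION = "1.0e.0.30-1vmw"


def module_version(vmkload_mod_output: str) -> str:
    """Return the value of the 'Version:' line from `vmkload_mod -s` output."""
    match = re.search(r"^\s*Version:\s*(\S+)", vmkload_mod_output, re.MULTILINE)
    if match is None:
        raise ValueError("no Version line found")
    return match.group(1)


# Trimmed copy of the output captured on the host.
sample = """vmkload_mod module information
 input file: /usr/lib/vmware/vmkmod/nvme
 Version: 1.2.0.27-4vmw.550.0.0.1331820
 License: BSD
"""

installed = module_version(sample)
# Assumption: a compliant driver's full version string starts with the HCL string.
matches_hcl = installed.startswith(HCL_DRIVER_VERSION)
print(installed, "matches HCL" if matches_hcl else "does NOT match HCL")
```

For the driver that ships in the Dell image, this reports a mismatch, which is why a different driver has to be installed.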

The driver from the VMware site should be downloaded, saved to a datastore or /tmp, and then the commands to uninstall and install should be run. I'll paste the commands here soon.

Wednesday, December 9, 2015

Testing SSD-enabled VMware products in the nested home lab

So here's a little project for December.

I use 3 solutions at work that rely on SSDs, or offer them as an option. The solutions are:

PernixData FVP
Stormagic SVSAN
VMware Virtual SAN (VSAN)

I want to run them in my home lab. However, my home lab is a single desktop with 32 GB of RAM. It does have a 500 GB SSD and a normal spinning disk, so at least there is SSD performance available, but I may need to get creative.

So, my little experiment will be to create two or three "SSD enabled" nested hosts and test whether the products will install and run. Here are the links for requesting the product trials:

https://get.pernixdata.com/FVPTrial
http://www.stormagic.com/60-day-free-trial/
https://www.vmware.com/products/virtual-san/vsan-hol.html

I will blog my experience with each one, including any tricks I had to use to make the nested hosts work properly. I obviously can't run any production on these; this is basically a lab for me to test installation, upgrades and what happens if I switch from vSphere 5.x to 6. I will google and tweet questions as needed.

Thursday, October 15, 2015

Upgrading Intel SSD firmware on Dell servers

I have retail Intel DC S3710 SSDs installed in Dell servers and I wanted to do firmware upgrades on them. I went to the product page and found there is an Intel SSD Data Center Tool.

Links as of 10/15/2015 are

download: https://downloadcenter.intel.com/download/23931/Intel-Solid-State-Drive-Data-Center-Tool
manual: https://downloadmirror.intel.com/23931/eng/Intel_SSD_Data_Center_Tool_2_3_x_User_Guide_331961-005.pdf

This is a command line tool. Install the x64 or x86 (32-bit) version, then run a command prompt as administrator and navigate to c:\isdct.

Normally you would run this and expect to get information:

C:\isdct>isdct.exe show -intelssd
No results

You will not get information when running it from a Dell server. Dell servers normally have a PERC RAID HBA card between the OS and the drives; PERC is based on the LSI MegaRAID cards.

In the manual, there is this disclaimer:

The Intel SSD DCT does not support SSD Data Center SATA drives behind HBAs (exception: LSI Mega RAID adapters).

So I'm thinking this might work.

The command to enable the LSI adapter is

c:\isdct>isdct.exe set -system EnableLSIAdapter = true
Set EnableLSIAdapter successful.

Results:
1. Dell R610 with a 6/i card

I ran this on a Dell R610 with a 6/i card and then ran the isdct show -intelssd command. After 2 or so minutes, it crashed the server. This can be found in the event logs after rebooting:

Controller event log: Fatal firmware error: Driver detected possible FW hang, halting FW.:  Controller 0 (PERC 6/i Integrated)


I couldn't get any further, and had to upgrade the drives on another server and bring them back to this one.


2. R710 with an H700 card

I ran it on an R710 with an H700 card and it did complete (although I did get errors in the Windows System log with any operation):

C:\isdct>isdct show -intelssd
- IntelSSD BTHV503605A9400NGN -
DeviceStatus: Healthy
Firmware: G2010110
FirmwareUpdateAvailable: G2010140
ModelNumber: INTEL SSDSC2BA400G4
ProductFamily: Intel SSD DC S3710 Series
SerialNumber: BTHV503605A9400NGN
Index: 0
DevicePath: LSI4
Bootloader: Property does not exist.

- IntelSSD BTHV50360580400NGN -
DeviceStatus: Healthy
Firmware: G2010110
FirmwareUpdateAvailable: G2010140
ModelNumber: INTEL SSDSC2BA400G4
ProductFamily: Intel SSD DC S3710 Series
SerialNumber: BTHV50360580400NGN
Index: 1
DevicePath: LSI5
Bootloader: Property does not exist.
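
With more than a couple of drives, output like the above gets tedious to read by eye. Here is a minimal sketch that parses saved `isdct show -intelssd` text into one record per drive and lists the drives with a pending update. The function names are hypothetical, and the "no spaces means it is a version string" test is only a heuristic based on the outputs shown in this post.

```python
from typing import Dict, List, Tuple


def parse_isdct_show(output: str) -> List[Dict[str, str]]:
    """Parse `isdct show -intelssd` text into one dict per drive."""
    drives: List[Dict[str, str]] = []
    current = None
    last_key = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- IntelSSD") and line.endswith("-"):
            # A "- IntelSSD <serial> -" banner starts a new drive record.
            current = {}
            drives.append(current)
            last_key = None
        elif current is not None:
            key, sep, value = line.partition(":")
            if sep:
                last_key = key.strip()
                current[last_key] = value.strip()
            elif last_key is not None:
                # No colon: treat as a console-wrapped continuation line.
                current[last_key] += " " + line
    return drives


def drives_needing_update(drives: List[Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (index, current firmware, available firmware) per outdated drive.

    Heuristic: a FirmwareUpdateAvailable value without spaces is a version
    string; a sentence means the drive is already current.
    """
    pending = []
    for d in drives:
        available = d.get("FirmwareUpdateAvailable", "")
        if available and " " not in available:
            pending.append((d.get("Index", "?"), d.get("Firmware", "?"), available))
    return pending


sample = """- IntelSSD BTHV503605A9400NGN -
DeviceStatus: Healthy
Firmware: G2010110
FirmwareUpdateAvailable: G2010140
Index: 0
DevicePath: LSI4

- IntelSSD BTHV50360580400NGN -
DeviceStatus: Healthy
Firmware: G2010110
FirmwareUpdateAvailable: G2010140
Index: 1
DevicePath: LSI5
"""

drives = parse_isdct_show(sample)
pending = drives_needing_update(drives)
for index, old, new in pending:
    print("drive index %s: %s -> %s" % (index, old, new))
```

The index values it reports are the ones the load command expects.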


To run the firmware upgrade, use the load command with the index number that show reported for the drive.

C:\isdct>isdct load -intelssd 0
WARNING! You have selected to update the drives firmware!
Proceed with the update? (Y|N): Y
Updating firmware...
Firmware update successful. Please reboot the system.


C:\isdct>

Please note that I lost access to the server after the firmware had completed!

I power cycled the server (server off, wait 5 seconds, turn on) and did the next SSD (after which the server again failed and had to be power cycled). After the reboot I got a good status:

C:\isdct>isdct.exe show -intelssd
- IntelSSD BTHV503605A9400NGN -
DeviceStatus: Healthy
Firmware: G2010140
FirmwareUpdateAvailable: The selected Intel SSD contains current firmware as of
this tool release.
ModelNumber: INTEL SSDSC2BA400G4
ProductFamily: Intel SSD DC S3710 Series
SerialNumber: BTHV503605A9400NGN
Index: 0
DevicePath: LSI4
Bootloader: Property does not exist.

- IntelSSD BTHV50360580400NGN -
DeviceStatus: Healthy
Firmware: G2010140
FirmwareUpdateAvailable: The selected Intel SSD contains current firmware as of
this tool release.
ModelNumber: INTEL SSDSC2BA400G4
ProductFamily: Intel SSD DC S3710 Series
SerialNumber: BTHV50360580400NGN
Index: 1
DevicePath: LSI5
Bootloader: Property does not exist.

Running any isdct operation still logged several sense and "drive not certified" errors in the Windows System log, but they stopped after that.


3. R620 with a PERC H710P Mini

I ran it on an R620 with a PERC H710P Mini:

C:\isdct>isdct.exe show -intelssd
- IntelSSD BTHV503605AM400NGN -
DeviceStatus: Healthy
Firmware: G2010110
FirmwareUpdateAvailable: G2010140
ModelNumber: INTEL SSDSC2BA400G4
ProductFamily: Intel SSD DC S3710 Series
SerialNumber: BTHV503605AM400NGN
Index: 0
DevicePath: LSI6
Bootloader: Property does not exist.

- IntelSSD BTHV503602BZ400NGN -
DeviceStatus: Healthy
Firmware: G2010110
FirmwareUpdateAvailable: G2010140
ModelNumber: INTEL SSDSC2BA400G4
ProductFamily: Intel SSD DC S3710 Series
SerialNumber: BTHV503602BZ400NGN
Index: 1
DevicePath: LSI7
Bootloader: Property does not exist.

C:\isdct>isdct load -intelssd 0
WARNING! You have selected to update the drives firmware!
Proceed with the update? (Y|N): y
Updating firmware...
Firmware update successful. Please reboot the system.

C:\isdct>isdct load -intelssd 1
WARNING! You have selected to update the drives firmware!
Proceed with the update? (Y|N): y
Updating firmware...
Firmware update successful. Please reboot the system.

C:\isdct>

This one did not fail on me after the first disk - I was able to run the two firmware upgrades and shut down the server cleanly. I did do a power cycle (server off, wait 5 seconds, then turn on) after the firmware upgrades.

4. R610 with an H700 card

I also did this on an R610 with an H700, and the OS failed as well once an SSD was flashed, so it seems to be a problem with the H700 card specifically.