Tag Archive: Xangati


Ok so I’ll take my tongue out of my cheek – I have never heard Xangati’s summer refresh of their performance monitoring dashboard called by its Three Letter Acronym (TLA) before, but I was lucky enough to be given a preview of the Xangati Management Dashboard (XMD), and to be shown some of the new ways in which it can gather information and metrics relevant to a virtualised desktop deployment.

When I first came across the product about 12 months ago, its main strength was in the networking information it could surface to a VI admin – by using small appliance VMs sitting on a promiscuous port group on a virtual switch, it was able to analyse NetFlow data going in and out of a host, aligned with metrics coming from Virtual Center. The product’s “TiVo”-like recording interface was able to capture what was happening to an infrastructure either side of an incident, be it a predefined threshold or an automatically derived one – where a workload was suitably predictable for a given length of time, the application was able to create profiles for a number of metrics and record behaviour outside that particular profile.

As with other products that attempt dynamic thresholding, the problem comes in the form of an environment which is not subject to a predictable workload, where it is possible to miss an alert while the software is still “learning” – it also assumes that you have a good baseline to start with. If you have a legacy issue that becomes incorporated into that profile, then it can be difficult to troubleshoot. To this extent I’m glad that more traditional static thresholds can still be put in place.

When monitoring vSphere 5 environments, there is no longer a requirement for the network flow appliances – the NetFlow data is provided directly to the main XMD appliance via a vCenter API. With a single API connection, the application is focused on much more than just the network data – allowing a VMware admin to see a wide view of the infrastructure, from the Windows process to the VMware datastore.

 

What interested me about the briefing was the level of attention being paid to VDI – I think Xangati is unique in terms of its VDI monitoring dashboard, and the latest release reinforces that. In addition to the metrics you would expect around a given virtual machine’s resource consumption, Xangati have partnered with Teradici, the developers of the PCoIP protocol, in order to provide enhanced metrics at the protocol layer of a VDI connection. This offers a welcome alternative to the current method of having to utilise log analysers like Splunk.

 

VDI users are, in my opinion, much more sensitive to temporary performance glitches than consumers of a hosted web service. If a website is a little slow for a few seconds, people might blame their network connection or any number of other issues, but for a VDI consumer that desktop is their “world”, and a glitch affects every application they use. Thus when it runs poorly they are much more liable to escalate than the aforementioned web service consumers. Use of the XMD within a VDI environment allows an administrator to troubleshoot those kinds of issues (such as storage latency, or a badly configured AV policy causing excessive IO) by examining the interaction between all of the components of the VDI infrastructure, even if the problem occurred beyond the rollup frequency of a more conventional monitoring product. This is what Xangati views as one of its strengths. While I don’t think it is a product I would use for day-to-day monitoring of an environment – there is a lot of data onscreen, and without profiles or adequate threshold tuning it would require more interaction than most “wallboard” monitoring solutions – I can see it being deployed as a tool for deeper troubleshooting. There is a facility which allows an end user to trigger a “recording” of the relevant metrics to their desktop while they are experiencing a problem (although if the problem is intermittent network connectivity, this could prove interesting!)

 

As a tool for monitoring VDI environments it certainly has some traction, notably being used to monitor the cloud-based lab environments at last year’s and this year’s VMworld, as well as by some good-sized enterprise customers. With this success I’m a little surprised at the last part of the launch: “No Brainer” pricing. In a market where prices seem to be on an upward trend, Xangati have applied a substantial discount to theirs, with pricing for the VDI dashboard starting at $10 per desktop for up to 1000 desktops. I’m told there is an additional fee for environments larger than that. I’m no analyst, but I’d love to explore the rationale behind this. Was the product seen as too expensive (although, as with many things, the list price and the price a keen customer pays can often be pretty different)? Is this an attempt to make software pricing a little more “no nonsense”? I guess time will tell!


For more information on the XMD, and to download a free version of the new product (good for a single host), check out http://Xangati.com

This is a blog post that I’ve had at the back of my mind for a good six months or so. The pieces of the puzzle came together after the Gestalt IT Tech Field Day event in Boston. After spending the best part of a week with some very, very clever virtualisation pros, I think I’ve managed to marshal the ideas that have been trying to make the cerebral-cortex-to-WordPress migration for some time!

Managing an environment, be it physical or virtual, for capacity & performance requires tools that can provide you with a view along the timeline. Often the key difference between dedicated “capacity management” offerings and performance management tools is the very scale of that timeline.


Short Term : Performance & Availability

Here we are looking at timings within a few seconds or minutes (or less). This is where a toolset is focused on the current performance of any particular metric, be it the response time to load a web application, utilisation of a processor core, or the command operation rate on a disk array. The tools best placed to give us that information need to be capable of processing a large volume of data very quickly, due to the requirement to pull in a given metric at a very frequent interval. The more frequently you can sample the data, the better quality output the tool can give. This can present a problem in large-scale deployments, because many tools have to write this data out to a table in a database – this potentially tethers the performance of the monitoring tool to the underlying storage available to it, which of course can be increased, but sometimes at quite a significant cost. As a result you may want to scope the use of such tools only to the workloads that require that short-term, high-resolution monitoring. In a production environment with a known baseline workload, tools that use a dynamic threshold or profile for alerting on a metric can be very useful here (for example Xangati or vCenter Operations). If you don’t have a workload that can be suitably baselined (and note that the baseline can vary with your business cycle, so may well take 12 months to establish!) then dynamic thresholds are not of as much use.
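To make the “learning” caveat concrete, here is a minimal sketch of profile-based alerting, assuming a simple rolling mean and standard deviation as the baseline – the actual models in products like Xangati or vCenter Operations are proprietary and far more sophisticated than this.

```python
from collections import deque
from statistics import mean, stdev

class DynamicThreshold:
    """Toy profile-based alerting: flag samples that stray too far
    from a rolling baseline."""

    def __init__(self, window=288, sigmas=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # e.g. 288 five-minute samples = 24h
        self.sigmas = sigmas                 # width of the tolerance band
        self.min_samples = min_samples       # no alerting while still "learning"

    def observe(self, value):
        """Return True if the sample breaches the learned profile."""
        breach = False
        if len(self.samples) >= self.min_samples:
            mu, sd = mean(self.samples), stdev(self.samples)
            breach = abs(value - mu) > self.sigmas * max(sd, 1e-9)
        # Note: breaches are also fed back into the baseline, which is
        # exactly how a legacy issue gets baked into the profile.
        self.samples.append(value)
        return breach

# Hypothetical CPU utilisation samples, one every five minutes:
cpu = [21, 19, 22, 20, 23, 21, 96, 22]
mon = DynamicThreshold(min_samples=5)
for pct in cpu:
    if mon.observe(pct):
        print(f"profile breach: CPU at {pct}%")  # fires on the 96
```

Even this toy version shows the two failure modes from the paragraph above: nothing fires during the learning window, and an anomaly that makes it into the history widens the band for everything that follows.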

Availability tools have less of a reliance on a high-performance data layer, as they are essentially storing a single bit of data on a given metric. This means the toolset can scale pretty well. The key part of availability monitoring is the visualisation and reporting layer. There is no point displaying that data on a beautiful and elegant dashboard if no one is there to see it (and, according to the Zen theory of network operations, would it change if there was no one there to watch it?). The data needs to be fed into a system that best allows an action to be taken – even if it’s an SMS or page to someone who is asleep. In this kind of case, having suitable thresholds is important – you don’t want to be setting fire alarms off for a blip in a system that does not affect the end service. Know the dependencies on the service, and try to ensure that the root-cause alert is the first one sent out. You need to know that the router that affects 10,000 websites is out long before you have alerts for those individual websites.
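As an illustration of that last point, a dependency-aware alerting layer can suppress the downstream noise so that only the root cause pages out. The sketch below is hypothetical – the topology and service names are invented for the example:

```python
# Each service maps to its upstream dependency (None = top of the chain).
depends_on = {
    "website-001": "core-router",
    "website-002": "core-router",
    "core-router": None,
}

def alerts_to_send(down):
    """Given the set of failed checks, drop any alert whose upstream
    dependency is itself down - the root cause pages instead."""
    out = []
    for svc in down:
        parent = depends_on.get(svc)
        if parent in down:
            continue  # suppressed: its parent will alert
        out.append(svc)
    return out

print(alerts_to_send({"website-001", "website-002", "core-router"}))
# -> ['core-router']  (the 10,000-website page storm is avoided)
```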

Medium Term : Trending & Optimisation

Where the timeline goes beyond “what’s wrong now”, you can start to look at what’s going to go wrong soon. This is edge-of-the-crystal-ball stuff, where predictions are made in the order of days or weeks. Based on utilisation data collected over a given period, we can assess whether we have sufficient capacity to provide an acceptable service level in the near future. At this stage, adjustments can be made to the infrastructure in the form of resource balancing (by storage or traditional load) – tweaks can also be made to virtual machine configuration to “rightsize” an environment. By using these techniques it is possible to reclaim over-allocated resources and delay potential hardware expansions. This is especially valid where there may be a long lead time on a hardware order. The types of recommendations generated by the capacity optimisation components of the VKernel, NetApp (Akorri) and SolarWinds products are great examples of rightsizing calculations. As the environment scales up, we are looking not only for optimisations but for potential automated remediation (within the bounds of a change-controlled environment), which would save time and therefore money.
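Those vendors’ recommendation engines are proprietary, but the underlying idea of a rightsizing calculation can be sketched as a percentile of observed demand plus headroom – the function, sample data and thresholds below are purely illustrative:

```python
import math

def rightsize_vcpus(allocated, cpu_pct_samples, percentile=95, headroom=1.25):
    """Recommend a vCPU count: demand at the chosen percentile of the
    observed utilisation samples, plus headroom, rounded up to whole cores."""
    ranked = sorted(cpu_pct_samples)
    idx = min(len(ranked) - 1, int(len(ranked) * percentile / 100))
    demand = allocated * ranked[idx] / 100  # vCPUs actually needed at peak
    return max(1, math.ceil(demand * headroom))

# An 8-vCPU VM that rarely exceeds 30% utilisation:
samples = [12, 18, 25, 22, 30, 15, 28, 20, 26, 19]
print(rightsize_vcpus(8, samples))  # -> 3: five over-allocated vCPUs to reclaim
```

Sizing to a percentile rather than the absolute peak is a deliberate trade-off: it stops one freak spike from dictating the allocation, at the cost of accepting brief contention.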

Long Term : Capacity Analysis – when do we need to migrate data centres?

Trying to predict what is going to happen to an IT infrastructure in the long term is a little like trying to predict the weather in five years’ time: you know roughly what might happen, but you don’t really know when. Taking a tangent away from the technology side of things, this is where IT strategy comes in – knowing what applications are likely to come into the pipeline. Without this knowledge you can only guess how much capacity you will need in the long term. The process can be bidirectional though, with the information from a capacity management function being fed back into the wider picture for architectural strategy. For example, should a lack of physical space be discovered, this may combine with a strategy to refresh existing servers with blades.

Larger enterprises will often deploy dedicated capacity management software to do this (for example Metron’s Athene product, which will model capacity for not only the virtual but also the physical environment). Long-term trending is a key part of a capacity management strategy, but it will need to be blended with a solution that allows environmental modelling and “what if” scenarios. Within the virtual environment, the scheduled modelling feature of VKernel’s vOperations Suite is possibly the best example of this that I’ve come across so far – all that is missing is an API to link to any particular enterprise architecture applications.

When planning for growth, not only must the growth of the application set be considered, but also the expansion of the management framework around it, including but not limited to backup and the short-to-medium-term monitoring solutions. Unless you are consuming your IT infrastructure as a service, you will not be able to get away with a suite that only looks at the virtual piece of the puzzle – power, cooling and available space need to be considered. Look far enough into the future and you may want to look at some new premises!
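As a rough illustration of the trending piece, the sketch below fits a straight line through historical consumption and projects when it crosses installed capacity. Real capacity management products layer seasonality, business cycles and modelled “what if” scenarios on top of anything this naive, and the figures here are invented:

```python
def months_until_full(monthly_used_tb, capacity_tb):
    """Least-squares slope of usage over time -> months until exhaustion
    (None if usage is flat or shrinking)."""
    n = len(monthly_used_tb)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(monthly_used_tb) / n
    slope = sum((x - x_mean) * (y - y_mean)
                for x, y in zip(xs, monthly_used_tb)) \
            / sum((x - x_mean) ** 2 for x in xs)
    if slope <= 0:
        return None
    return (capacity_tb - monthly_used_tb[-1]) / slope

# Twelve months of array consumption against a 100 TB frame:
usage = [40, 42, 45, 46, 49, 52, 54, 57, 59, 62, 64, 67]
print(f"~{months_until_full(usage, 100):.0f} months of runway left")  # ~13
```

With a long hardware lead time, it is the date the line crosses the ceiling that matters, not the current utilisation figure.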

We’re going to need a bigger house to fit the one pane of glass into…

“One pane of glass” is a phrase I hear very often, but not something I’ve really seen so far. Given the many facets of a management solution I have touched on above, that single pane of glass is going to need to display a lot! With so many metrics and visualisations to put together, you’d have a very cluttered single pane. Consolidating data from many systems into a mash-up portal is about the best that can be done, and yet there isn’t a single framework to date that can really tick all the boxes. Given the lack of a “saviour” product you may feel disheartened, but have faith! As the ecosystem begins to realise that no single vendor can give you everything, and that an integrated management platform which can not only display consolidated data but also act as a data bus to facilitate sharing between those discrete facets is very high on the enterprise wishlist, we may see something yet.

I’d like to leave you with some of the inspiration for this post, as seen on a recent “demotivational poster” – a quick reminder that perfection is in the eye of the beholder.

“No matter how good she looks, some other guy is sick and tired of putting up with her s***”


 

As the tweet above proves, I’m about to out-scoop Eric “Scoop” Sloof of ntpro.nl fame, and would like to be the first to break the news on the innovative Pork Product Delivery System (PPDS) from your favourite real-time monitoring provider, Xangati.

 

In a recent briefing on the new VDI/VI dashboards, I was able to grab a screenshot as the presenter flicked to a preview screen that proves this to be the case.

 


 

Not only is Xangati able to provide role-based dashboards of real-time data about your VI environment that reflect the real health issues within a system, but they are able to monitor the Saltiness Levels for Admins (SLAs) and trigger an Automated Bacon Delivery Service (ABDS), provided via a network of bacon resellers (ButcherNet). This has already been successfully beta-tested at Tech Field Day. Turkey-Based Bacon Substitute (TBBS) is available for environments that don’t dig on swine.