Yossi Dahan [BizTalk]

Friday, January 22, 2010

What I’ve learnt about BizTalk Hosts

Here’s something I learnt from Guru Venkataraman on one of my visits to Redmond last year -

Like most seasoned BizTalk developers, I suspect, I usually follow the best practice around host planning nicely summarised by Marcel here.

What I did not think about (and nor did Marcel, it appears) is the actual queues behind these hosts, how they are used, and the impact on performance; in comes Guru’s wisdom-

Each host has a queue, implemented as a table in the message box database, so - loosely speaking – when a message is published to the message box, BizTalk determines which subscribers are interested in the message, and places the message (well – logically) in the queues of the relevant hosts.

Each host polls its own queue periodically (the default is 500ms) to get any pending messages and acts on them – starting a new instance or correlating to an existing instance, potentially rehydrating it.

Now imagine you have one host running two different processes and that two messages are received both of which should trigger a new instance of process A.

The host picks up the first message (for this discussion to be clear enough we’ll have to pretend it’s working on a single thread, although I suspect several aspects of this are not) and passes it to a new instance of process A.

Imagine now that process A publishes a message intended for process B; as process B runs under the same host, the message gets into the very same queue in which we already have the second message queued for process A.

Now – process B cannot start until the host reads the pending message and starts the second instance of process A. It is only at this point that the message intended for process B is at the ‘front’ of the queue and can be read by the host, and an instance of process B can be started. Effectively the execution of process B has been delayed unnecessarily because there were messages queued for process A.

Now – of course much of this is indeed multi-threaded in BizTalk, and so the problem is not that severe. BizTalk was clearly designed to host many processes under any single host and there’s definitely not a requirement (or a recommendation) to have a host per process; that would be ridiculous. However -

Where you have a flow where process A often publishes messages that end up in another process (B, C or D, to make up some names), and you have a reasonable throughput, you would get better performance if the publisher of the messages (process A) is hosted separately from potential subscribers (B, C or D).

With the subscribers configured with a different host they can pick up their messages without being affected by other messages waiting for process A, and vice versa – the same goes for any replies published by them.
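The effect can be sketched with a toy model. This is only an illustration under the same simplification used above: each host works on a single thread and takes one message from its queue per step. The function name and the step-based timing are made up for the example; they are not how BizTalk itself is implemented.

```python
from collections import deque


def first_b_start(n_a_messages, shared_host):
    """Return the step at which the first process-B message is dequeued.

    Toy model: every host owns one FIFO queue and handles one message per step.
    Handling a message for process A publishes one message for process B.
    """
    queue_a = deque(("A", i) for i in range(n_a_messages))
    # With a shared host, A and B messages sit in the very same queue.
    queue_b = queue_a if shared_host else deque()
    queues = [queue_a] if shared_host else [queue_a, queue_b]

    step = 0
    while any(queues):
        step += 1
        # Each host takes the message at the front of its own queue.
        heads = [q.popleft() for q in queues if q]
        for target, i in heads:
            if target == "B":
                return step
            # Process A publishes a message intended for process B.
            queue_b.append(("B", i))
    return None


if __name__ == "__main__":
    print("shared host:   ", first_b_start(5, shared_host=True))
    print("separate hosts:", first_b_start(5, shared_host=False))
```

With five messages waiting for process A, the shared host only reaches the first message for process B after all five have been handled, while with separate hosts process B starts as soon as its first message is published.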

Two things worth remembering in this context:

  1. When you ‘call’ an orchestration, as opposed to ‘start’-ing an orchestration, the host of the calling orchestration is used, regardless of what’s configured in the admin console
  2. Whilst I’ve specifically mentioned processes, the same considerations apply to send ports, and even receive locations, and when looking at the former it is worth remembering that dynamic send ports, where used, always use the default host for the adapter used.


Wednesday, September 02, 2009

Creating messages from scratch - revisited briefly

A while back I posted about the different ways to create messages in an orchestration, and later some performance comparison between them.

Mostly for fun I ran a quick test on my newly installed laptop; I did not put in nearly as much effort as I have previously, so don’t read too much into these numbers, but I was amazed to see that all the results were running pretty much 10 times faster.

Now – it’s a new BizTalk (2009), new SQL server (2008), new operating system (Windows 7) and a new(-ish) laptop (Thinkpad T61), so there’s no way to know how much each component contributed to the improvement, but it is amazing how much difference can exist after just one year!

Well – not at all scientific, but I found it interesting anyway!


Wednesday, November 05, 2008

Message Creation in BizTalk - solution uploaded

A few weeks ago I published this post about some experiments Randal van Splunteren and I did around message creation.

Not surprisingly I was asked to post the solution we've used and so I have uploaded it here

 

Have fun! (let me know if anything's missing or unclear, it's been a while since I ran this...)


Saturday, October 11, 2008

Fun with Message Creation in BizTalk

Back in March I posted this entry about creating messages "from scratch" in BizTalk.

The post started a bit of an online discussion and a slightly more intensive offline discussion about the various ways to create messages and the differences between them.

As part of that discussion, Randal van Splunteren and I exchanged some emails and Randal took the time and effort to create a test solution to compare the performance characteristics of the various methods, which I helped validate.

Randal has been kind enough to let me summarise our findings in this blog (and it only took me 6 months...but I have my excuses) so here it is -

The scenario we've used to test is as follows -

There is one main orchestration that takes in a ‘command’ message using a file receive location; in this command message you can define the method to create a message:

  • Map with Defaults (1)
  • Map with xsl (2)
  • Assignment with serialization (3)
  • Assignment with resource file (4)
  • Using undocumented API (5)

The first four options create messages according to the four methods I described in my blog post; the fifth one uses the CreateXmlInstance API suggested by Randal as a comment on my original post.

In the command message you can also set the number of messages that must be created;

Finally you can set if the method should use caching; we've implemented a very simple caching mechanism for the assignment and undocumented API methods (caching the generated instance in all three methods so it can be re-used in subsequent calls); for the map methods the caching parameter is ignored because BizTalk has its own caching for those methods.

When a particular test is finished the main orchestration writes out a ‘report’ message (again using file adapter) which contains the number of elapsed ticks the test took.

I ran all the scenarios 5 times and averaged the results. Between tests I restarted the host to get as close to a like-for-like comparison as I could, so these numbers do not reflect the true runtime performance of a live server, only the difference between the approaches. Initially I ran all the tests creating 1 message at a time; here are the results:

(all values are elapsed ticks)

msgs  Map using defaults  Map using xsl  Assign using serialisation  Assign using resource  Assign using API
1             13,243,663     12,687,314                   8,153,346              8,135,461        36,374,565
1             13,385,005     12,888,630                   6,905,139              8,620,287        36,468,805
1             12,837,338     13,943,338                   9,272,362              8,815,033        37,723,069
1             15,630,602     13,298,954                   6,679,173              8,027,708        35,877,260
1             12,729,576     12,765,337                   7,113,975              9,174,668        36,919,198
Avg           13,565,237     13,116,715                   7,624,799              8,554,631        36,672,579

or to put it graphically - image

Then I ran all the tests again, this time creating 100 messages at a time -

(all values are elapsed ticks)

msgs  Map using defaults  Map using xsl  Assign using serialisation  Assign using resource  Assign using API
100           15,195,199     15,254,912                   9,158,223              8,951,018       231,352,547
100           14,421,621     16,523,637                   9,259,892              8,700,856       226,704,695
100           15,199,198     15,010,499                   8,476,670             10,222,202       232,357,798
100           16,725,023     15,684,085                   9,110,269              9,866,252       227,806,462
100           15,349,885     14,475,857                   9,101,879             10,295,228       226,928,786
Avg           15,378,185     15,389,798                   9,021,387              9,607,111       229,030,058

image

Last I ran the 3 non-mapper versions with the caching enabled -

(all values are elapsed ticks)

# messages  Assign using serialisation (cached)  Assign using resource (cached)  Assign using API (cached)
100                                   9,696,044                       9,478,015                 41,350,100
100                                   8,288,120                      10,087,574                 37,410,620
100                                   9,156,289                      10,473,718                 36,493,118
100                                   8,715,621                      10,001,671                 40,628,198
100                                   8,289,295                       9,951,817                 37,919,237
Average                               8,829,074                       9,998,559                 38,760,255

image

So, what have I spotted?

Well, to start with, comparing my results with those Randal had I learnt that my laptop is much slower than his machine...(but you can't see that from the results, nor, I suspect, do you care...)

But seriously -

  • It is interesting to see how, with the exception of the API scenario, there is very little difference between the generation of 1 message and the generation of 100.
  • It is quite obvious that the API call is much slower than the rest, but that does not surprise me considering the amount of work involved (getting the schema from the database, generating the instance off the XSD retrieved...)
  • For that reason, it is also quite obvious that this method benefited the most from the use of the cache (but was still significantly slower than the others), as the cache prevented the repeated access to the database and the xml generation.
  • By the same token, caching did not make a very significant difference in the other scenarios, but again - I wouldn't consider that surprising (as there's very little work involved)
  • And of course - it is clear that using an assignment shape to create messages using either serialisation or a resource file is indeed the fastest way (serialisation being a little faster on my machine)
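The first two observations can be made a little more concrete with some arithmetic on the averages in the tables above. The sketch below assumes each test costs a fixed overhead plus a roughly linear cost per message, which is an assumption made for illustration and not something the tests themselves establish.

```python
# Average elapsed ticks copied from the tables above:
# (1 message per test, 100 messages per test), caching disabled.
AVG_TICKS = {
    "Map using defaults": (13_565_237, 15_378_185),
    "Map using xsl": (13_116_715, 15_389_798),
    "Assign using serialisation": (7_624_799, 9_021_387),
    "Assign using resource": (8_554_631, 9_607_111),
    "Assign using API": (36_672_579, 229_030_058),
}


def marginal_ticks(one, hundred):
    """Approximate ticks per additional message.

    Assumes a fixed overhead plus a linear per-message cost, so the
    99 extra messages account for the whole difference.
    """
    return (hundred - one) / 99


marginal = {name: marginal_ticks(*avgs) for name, avgs in AVG_TICKS.items()}

if __name__ == "__main__":
    for name, ticks in sorted(marginal.items(), key=lambda kv: kv[1]):
        print(f"{name:28s} {ticks:>12,.0f}")
```

On these numbers each additional message costs somewhere between roughly 10,000 and 23,000 ticks for the map and assignment methods, against roughly 1.9 million ticks for the API method, which is why the API column is the only one that changes noticeably between 1 and 100 messages.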

I hope you find this useful and again - many thanks to Randal for all his effort in helping me get this out.


Wednesday, July 09, 2008

BizTalk Performance Optimisation Guide

Microsoft has announced the publication of the new "Performance Optimisation Guide" written by Ewan Fairweather and Rob Steel.
I would expect this one to do wonders for some of us out there trying to give a little bit more "juice" to our BizTalk implementations - grab it here.


Thursday, April 24, 2008

Throttling in full action

Here's another one from the archives (=the list of things I have waiting to be blogged)

At some point we had a sudden peak in system load on our BizTalk processes and, as a result, our BizTalk solution that was running so nicely seemed to have got "stuck".

By "stuck" I mean - we ended up with lots of processes in "Active" state, but they did not seem to be active at all; a closer inspection (of trace that should have been emitted) showed that although the instances' status said "Active" they were all very passive indeed - nothing was executing on the server - close to 0% CPU and no trace whatsoever.

This is where you might expect me to describe the long hours we've spent investigating the issue, the sleepless nights and empty cartons of pizza... - but really what happened is that, not being able to afford any more down time, we called Premier Support. That turned out to be a great thing, because the first thing they did (well, not literally, but anyway) was to ask us to check the state of the server using the MsgBoxViewer, which in turn pointed out that we had simply "max-ed out" our memory consumption throttling level.

You see - we use a lot of caching of data in our processes, mostly because we access a lot of reference data frequently - data that does not change very often; this is by design. What we forgot to do is estimate the amount of memory this caching will require when many different clients use the system, and adjust the throttling level accordingly.

As you can see from the image below - out of the box the BizTalk hosts are configured to throttle at 25% of the server's physical memory. The idea is to prevent the BizTalk processes from taking up too much memory and killing the server, and the assumption is that if throttling kicks in and stops processing instances, memory consumption will slowly reduce until the server gets back to a healthier state. However - by its very nature - caching does not really release memory that often, so instances stopped progressing but no memory was released as a result, and we got "stuck".

clip_image004

In our case, the solution was straightforward - as we know our memory consumption will be high, and we know there's nothing else running on the server to compete for that memory (more or less), we could increase the threshold to 50%, which is enough to give BizTalk Server the memory it needs for the caching and all the processing requirements.

In the process we monitored the situation by watching two BizTalk performance counters - "Process memory usage threshold" (here shown as 500MB) compared to "Process memory usage" (here showing around 130MB).
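Why the cache keeps the host stuck, and why raising the threshold helps, can be shown with a very rough model. All the figures used here (2048MB of physical memory, a 600MB cache, 50MB per running instance) are hypothetical and chosen only to make the effect visible; the real throttling logic in BizTalk is more involved than this.

```python
def simulate(physical_mb, threshold_percent, cache_mb, per_instance_mb, pending, steps=20):
    """Return how many pending instances were allowed to run within `steps` steps.

    Toy model: the cache is never released, each running instance needs
    `per_instance_mb` for one step, and no new instance is started once
    that would push the process over the throttling threshold.
    """
    threshold_mb = physical_mb * threshold_percent / 100
    completed = 0
    running = 0
    for _ in range(steps):
        # Instances started in the previous step finish and free their memory.
        completed += running
        running = 0
        used = cache_mb  # cached reference data stays in memory
        while pending > 0 and used + per_instance_mb <= threshold_mb:
            used += per_instance_mb
            running += 1
            pending -= 1
    return completed + running


if __name__ == "__main__":
    print("25% threshold:", simulate(2048, 25, 600, 50, pending=100))
    print("50% threshold:", simulate(2048, 50, 600, 50, pending=100))
```

In this model the 25% threshold sits below the memory already held by the cache, so waiting never frees anything and no instance ever runs; at 50% there is room above the cache and the backlog drains.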

clip_image006

As long as there was a large enough gap between the two we knew our processes were going to be just fine; it is always important, of course, to monitor these over time to ensure there's no memory leak in the processes, which we have done, on top of peak load tests - which we have not.

Now, while all of this is down to a test or two we may have neglected on our side, there are a couple of interesting points at the back of this from a product perspective -

  1. We were confused by what we saw mostly because of the "active" state of all instances (and we had quite a few); we would have diagnosed the problem much quicker, and on our own, had the admin console indicated that the server is not actually processing anything due to its throttling state.

  2. I can't help but wonder whether the throttling mechanism couldn't be a bit more clever and identify that it has reached a dead end and is not actually helping to improve the situation. In our case the engine realised memory usage had gone too high and stopped processing instances. Wouldn't it be great if after, say, 10 minutes it realised that memory is not actually reducing, and so it will never exit the throttling state, and wrote something to the event log?
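The second point could be approximated outside the product by watching the two counters mentioned earlier. The check below is a sketch of that idea only: the function name, the sample format and the 10-minute window are assumptions, and collecting the counter values is left out.

```python
def throttling_looks_stuck(samples, threshold_mb, window_minutes=10):
    """Return True if memory stayed at or above the threshold for the whole window without falling.

    `samples` is a list of (minute, used_mb) readings, oldest first.
    """
    if not samples:
        return False
    last_minute = samples[-1][0]
    window = [(m, used) for m, used in samples if m >= last_minute - window_minutes]
    # Only draw a conclusion once the readings span the full window.
    if window[0][0] > last_minute - window_minutes:
        return False
    above = all(used >= threshold_mb for _, used in window)
    not_falling = window[-1][1] >= window[0][1]
    return above and not_falling


if __name__ == "__main__":
    flat = [(minute, 600) for minute in range(11)]
    if throttling_looks_stuck(flat, threshold_mb=500):
        print("Memory has not dropped below the throttling threshold for 10 minutes")
```

A scheduled job could run a check like this and write to the event log, or alert someone, when it returns True.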

Again - not trying to make any excuses, just thoughts with the power of hindsight...


Thursday, December 27, 2007

Mapper vs. XSLT round 2


I've received a good question today -

"we had a little debate in the office today - what is faster - running a map with pure xsl or the standard way with functoids, what you think?"

As I've blogged before - I'm a big supporter of writing custom XSL and not using the Mapper and Functoids in anything other than the simplest of maps; so - although performance is only one of my arguments - the answer should be obvious.

Nevertheless I'll take the chance to answer properly again, although I suspect the question is not accurate enough -

At runtime there's no difference between the two; the Mapper generates XSL (which you can see by "validating" the map in Visual Studio and following the link to the generated XSL file, which appears in the output window). So the question should be, in my view, whether the Mapper can generate as good XSL as a developer could, but as you can imagine the answer really depends on the particular scenario - how many functoids are you using? How are they working together? What's the size of the map? What's its complexity?

Anyway, in my view there is a bottom line answer to that question, and that is that under most real-world scenarios custom written XSL will almost always be better than generated XSL, but I'll try to explain a little bit more -

When you're using Functoids in your map you're generally doing one of two things - you're either calling external assemblies or you're adding some XSL lines to perform some actions for you.

The former one is easier to tackle - if you need external assemblies you can call them from custom XSL as well (as I've explained here); as the Mapper will do exactly the same, the performance impact will generally be identical in both cases (using mapper or custom XSL).

The latter is harder to tackle, as there's no one-rule-fits-all statement one can make - but here's a shot at it -

The Mapper is a visual, generic, designer that generates code.
As is always the case with these tools they come with a price, and that price is often the quality of the code generated; now - don't get me wrong - I don't argue that the Mapper is bad, or that it always generates bad, slow XSL; but if you know XSL well, there's no doubt you will write better code than a generator will.

When you're adding a Functoid that does not call an external assembly you'll be doing one of three things -

  • You will be adding embedded C# code - most Functoids do this, look at the string manipulations as a simple example.

  • You will be adding a template based on input nodes - the Looping Functoid for example.

  • Or - you will be adding XSL structures or functions - the record count or value mapping Functoids for example.

All three are perfectly fine, and even more so - if you try them out you'll see that the designer does generate quite nice XSL in all cases.

The problem starts when, and this is inevitable in the real world, the maps get more complex.

Once you move out of the playground and into real scenarios, the maps get more complicated and the inefficiency of the generated code becomes more apparent (as multiple Functoids need to work together to achieve the desired output, the XSL gets 'uglier and uglier'); it also becomes a greater problem, as it is repeated many times over a large-ish map.

Bottom line from my perspective - if you (and the rest of the team) feel comfortable with XSL, use it; you will always achieve better scripts than any generator would. If you don't feel comfortable with XSL - learn it! It's easy! (and in the meantime use the Mapper).


Friday, September 07, 2007

Another dive into the SOAP adapter's behaviour

Being keen to fine tune our solution's performance we turned to look into the message exchanges between the SOAP adapter and one of the web services we're using quite extensively.

Two things became obvious very quickly (well, to be fair - they might not have been without some very useful tips from our colleagues working for the 3rd party providing that service) -

  • All request messages we've sent specify the "Expect: 100-continue" HTTP header
  • None of the requests sent out use pre-authentication, although we explicitly set credentials on the send port

so we promptly turned to look into them -

100-continue

I'm no network expert; I hold only the basic understanding of HTTP you'd expect from someone dealing with messaging and integration. The way I understand it, then, is that if a requester sets the "Expect: 100-continue" HTTP header on a request, the client sends only the HTTP headers to the server and then waits to receive a response with a "100-continue" status.

It is only when the "100-continue" status is returned from the server that the client sends the actual request.

I also understand that, in order to support older implementations of HTTP (or those that do not support 100-continue), if the client does not receive 100-continue it will send the request anyway, but only after the delay introduced while waiting for the server to respond.

What all that means is a delay of about 200ms on every message sent from BizTalk.

I'm sure Microsoft have given a lot of consideration to this before deciding this would be the default behaviour of web services in .net, but to us, communicating with a long-term partner over a good leased line, it would have been nice to be able to turn it off and skip the extra step.

Unfortunately, despite spending some time on this, we could not, as of now, find a way to do so.

Our best bet was (and still is) to set the relevant setting in the ServicePointManager in our custom web service proxy, but this does not seem to have the desired effect.

From a bit of reading we've done it seems there's some element of re-use here and that the ServicePointManager may use an existing service point, in which case our newly set property will not be used.

It could be that this is created somewhere in the adapter before our custom proxy or something like that; we truly don't know at this point.

Pre-Authenticate

Fortunately this is an area we’ve had better success in, and it's good because it has a significantly bigger impact on performance than the 100-continue problem.

The web service I've mentioned before uses basic authentication on the wire and then further elements of security in the message, so on the BizTalk port we've set the credentials the SOAP adapter should use when calling this web service, and, as you'd expect, this is working just fine.

However, analysing the traffic going out of BizTalk on the wire we've noticed that for each message we send to the 3rd party, the SOAP adapter would transmit the message out without the security HTTP headers first, receive an HTTP 401 (Unauthorised) error and only then send the request with the HTTP headers bearing the credentials we've set in the send port.

Obviously, and especially when relatively large messages are involved, this has a significant impact on the performance of the server (and the farm as a whole, as more bandwidth is used for the message exchange).

Again, doing some reading and experimenting, we've learnt that this is pretty much down to the choices made around the DEFAULT behaviour of .net, and again, we thought, we had no control over it.

But, while we could not do anything if we were using the standard configuration of the SOAP adapter, the fact that we were using a custom web proxy to call the web service (for other reasons) meant we could try and tweak the way messages were transmitted, and tweaking we did.

It didn't help us in the 100-continue case, but it definitely helped with the pre-authentication requirement - all we had to do was add a line setting the PreAuthenticate property of the HttpWebRequest to true before using it, and BizTalk (well, the .net framework web service stack it uses) would quite happily provide the credentials up front, avoiding the need to duplicate the round trip to the 3rd party server.
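To put a rough number on the saving, here is a back-of-the-envelope model of the traffic. It assumes that without pre-authentication every message is challenged and sent twice in full, as described above, and it ignores header sizes; the message count and size are hypothetical.

```python
def wire_cost(messages, body_bytes, pre_authenticate):
    """Return (requests_sent, body_bytes_sent) for a batch of messages.

    Without pre-authentication each message is first sent without
    credentials, rejected with a 401, and then sent again with the
    credentials in the HTTP headers.
    """
    attempts = 1 if pre_authenticate else 2
    return messages * attempts, messages * attempts * body_bytes


if __name__ == "__main__":
    # Hypothetical load: 1,000 messages of 50,000 bytes each.
    print("without pre-authentication:", wire_cost(1000, 50_000, False))
    print("with pre-authentication:   ", wire_cost(1000, 50_000, True))
```

Under these assumptions pre-authentication halves both the number of requests and the bytes sent, and the saving grows with message size.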

Much better now!
