Yossi Dahan [BizTalk]


Monday, May 05, 2008

Do we need schemas?

This is somewhat of a recurring theme with me recently, but I want to discuss the contents of the management database; more specifically I want to discuss the fact that schemas get deployed to it and that most other things deployed will have a strong dependency on schemas.

As schemas are always at the bottom of the dependency chain, this means that, on top of the expected difficulties one can experience when needing to change schemas and the impact on other systems, the actual act of deploying a new schema becomes a problem in itself.

At best this is simply an annoyance to a developer who needs to re-deploy his entire solution as the schema evolves through the development cycle (versioning is not applicable in this scenario);

At worst this is an operational nightmare if a solution has to be updated/patched/evolved where a good versioning story does not exist (as is all too often the case, not that versioning would have solved this all).

As we are forced to remove the entire solution and then re-deploy with the new schema, we can expect, from my experience, the process to take quite a while for large solutions, which may take the business offline for a couple of hours.

Taking the risk of making a point about something I don't know enough about - the internal behaviour of BizTalk server with regards to deployed schemas (but one could say this is often the case...) - I would argue that as far as I can tell, schemas are not actually used all that often by the runtime.

(and because I accept I could be completely wrong here, please do share any thoughts/ideas/comments/insights/whatever on the subject - put a comment on this post or email me if you prefer. I'd love to hear some feedback on this.)

Anyway - as I was saying -

When you define a message type you select the schema at design time, and the designer may refer to that schema to do various things: draw the map designer, check the validity of assignments in expression shapes, build intellisense; it will even check serialisation and de-serialisation attributes on classes against your schema when you try to assign a .NET class to a message in an expression shape. But as far as I'm aware, the schemas are rarely used by the runtime.

At runtime, when a message is received into an orchestration (and set to a pre-defined message type), its contents are not checked against the schema; nor is it validated at the end of a transform or message assignment shape.

When you run a map you select a schema, but again - that map could well return something completely different; BizTalk couldn't care less.

When do I know schemas get used? In the pipelines. Sometimes.

If you're using the XmlDisassembler for example it would try to resolve the message type based on the message's root node and namespace, and then try to get the schema from the database.
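To make that lookup key concrete: a BizTalk message type is the root node's namespace and name joined by '#'. Below is a minimal Python sketch of deriving such a key from a message. It only mimics the idea, not the disassembler's actual implementation; the helper name `message_type` and the sample namespace are made up for illustration.

```python
import xml.etree.ElementTree as ET


def message_type(xml_text):
    """Return 'namespace#rootName' for an XML document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # ElementTree represents qualified names as '{namespace}local'.
    if root.tag.startswith("{"):
        namespace, local = root.tag[1:].split("}", 1)
        return namespace + "#" + local
    # Root element in no namespace: assumed here to map to the bare name.
    return root.tag


# Made-up sample message; only the root element matters for the key.
sample = '<ns0:Order xmlns:ns0="http://example.org/orders"><Id>1</Id></ns0:Order>'
print(message_type(sample))
```

Note that nothing below the root node is inspected, which fits the argument here: resolving the type does not, by itself, require the schema's content.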

The disassembler may then use this schema to promote some properties; if configured, it may debatch the message according to the schema and possibly use it to validate the message. All are very valid usages for the schema, but they are not always used, and they require specific configuration, either in the schema at design time or in the pipeline component (or both).

Also, at least with regards to property promotion, all that gets used is a bunch of xpaths provided in an annotation in the schema, not the actual schema information.

There are, of course, other cases where schemas are required - FlatFileDisassembler, XmlValidation, Xml and FlatFile Assemblers all need schemas for their work (to some extent at least) and definitely the design time environment uses them extensively, but what I'm arguing is - can we do without having to deploy schemas if they are not used?

BizTalk works in a late-binding fashion anyway, where assemblies and their contents are loaded from the GAC/database as needed (and may be unloaded after a period of them not being used), couldn't we get away with only deploying the schema when it is needed at runtime, and simply 'register' message types when it is not?

In fact - even if a schema is needed at runtime - why does it need to exist in the database? How is it different from maps, pipelines and orchestrations, all of which are 'known' to the database but physically exist only in the GAC? (Well, that's not accurate - the orchestration's structure is stored as XML in the database, but that's to be displayed in HAT, and possibly a bad design decision on its own.)

I can't help thinking I'm missing something. I'm sure the guys behind this decision in BizTalk gave it a lot of thought and found good justification for it. Can anyone comment on what that might be?

One argument could be that BizTalk wants to know which messages are 'supported' by the solution - just as a message arriving with no subscription is considered an error, a message arriving which is not of a known 'type' should be considered an error. But in a sense the two are the same, and in any case BizTalk is quite happy to support 'blob' messages through the use of passthrough pipelines and XmlDocument as a message type in the orchestrations.


Sunday, February 03, 2008

GAC stings again....

....or is it just me being stupid again...I don't know...


But anyway - I was working the other day making some changes to one of our classes and, as most of our classes have both code and schema representation, I went on to generate the schema from the modified class.

So - I built the project with the class to generate an assembly, and in the Visual Studio command line I swiftly 'CD'-ed to the bin folder of the project and used XSD.exe to generate the schema, in order to replace my existing one and reflect the recent changes.

However, to my great surprise, the generated schema did not have any of the added nodes I expected to see - it looked suspiciously identical to the old schema I had.

It took me quite a few attempts before I realised what was happening - as you can imagine from the title of this post, the answer lies in the GAC. I had the older version of the assembly containing the class in the GAC, and so XSD.exe, although I had pointed it at a specific assembly in the bin folder, decided to pick the old one from the GAC and use it to reflect the class.

As soon as I removed the old version from the GAC, XSD.exe was happy enough to generate the correct schema for me.


Thursday, August 23, 2007

elementFormDefault in schemas

This is one of those things that, when finally spotted, leave you wondering how on earth you could have not noticed them before. In fact - I'm not quite sure I fully understand this even now, so I'll be very interested in hearing other people's thoughts on this.

I'm working on the integration with a 3rd party web service and they sent us their specification. Their request schema looks something like:


First thing I did was to add this schema to a BizTalk project and generate an Xml instance, which looked like:


This baffled me for a while as, unlike what I expected (and if I'm not mistaken), requestField1 and requestField2 do not "belong" to the "3rdPartyNamespace" namespace, but rather to an empty namespace.

I was expecting the Xml to look like -


But, as is apparent from the screen shot, VS was not happy (nor was the XmlValidatingReader class when I checked)

In order to make that last Xml valid I had to modify the schema and set the elementFormDefault attribute of the schema to qualified.
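The difference between the two instance shapes can be seen without any schema at all, just by asking a parser which namespace each child element ends up in. A small Python sketch follows; the root element name `Request` is made up for illustration, while the field names and namespace are the ones mentioned above.

```python
import xml.etree.ElementTree as ET

# Shape of the generated instance: only the root carries the prefix,
# so the unprefixed children are in no namespace at all.
unqualified = (
    '<ns0:Request xmlns:ns0="3rdPartyNamespace">'
    "<requestField1>a</requestField1>"
    "<requestField2>b</requestField2>"
    "</ns0:Request>"
)

# The expected shape: a default namespace that the children inherit.
qualified = (
    '<Request xmlns="3rdPartyNamespace">'
    "<requestField1>a</requestField1>"
    "<requestField2>b</requestField2>"
    "</Request>"
)


def child_namespaces(xml_text):
    """Map each child's local name to its namespace ('' when it has none)."""
    result = {}
    for child in ET.fromstring(xml_text):
        # ElementTree represents qualified names as '{namespace}local'.
        if child.tag.startswith("{"):
            namespace, local = child.tag[1:].split("}", 1)
        else:
            namespace, local = "", child.tag
        result[local] = namespace
    return result


print(child_namespaces(unqualified))
print(child_namespaces(qualified))
```

The first call reports an empty namespace for both fields and the second reports "3rdPartyNamespace", which is exactly the distinction elementFormDefault controls for locally declared elements.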

Having read a little bit about elementFormDefault I now officially declare that "I don't like it!" - mostly because it can effectively be used to "drop" namespaces from schemas (created or imported) and by doing this misrepresent how the xmls should look.
Note: I guess this statement deserves a blog post of its own, but I'm not going to go there (at least not now). I'm just interested in getting the schemas in my project correct, which would definitely, at this point, involve making sure they are all, always, qualified (and I believe they are, as I usually make a point of setting it explicitly, luckily).

I'm not quite sure why this is the default behaviour if this attribute is not specified, but that's a question for the W3C. What is even more interesting is that BizTalk introduced a new behaviour here to Visual Studio, which only further confused me -

If you add a new item to a BizTalk project by right clicking on it in the solution explorer and selecting "Add New Item" - the schema generated for you will not include the elementFormDefault attribute and so it will effectively be unqualified; if you do the same on a C# project or simply ask Visual Studio to create a new xsd file (outside the context of a BizTalk project, using the File menu) the schema generated will include the elementFormDefault="qualified" attribute.
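Since a schema that omits the attribute is effectively unqualified, checking what a given xsd file really says is easy to script. A minimal Python sketch, assuming the schema text has already been read into a string; the helper name `element_form_default` and the two tiny sample schemas are made up for illustration.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"


def element_form_default(xsd_text):
    """Return the effective elementFormDefault of a schema document."""
    root = ET.fromstring(xsd_text)
    if root.tag != "{%s}schema" % XSD_NS:
        raise ValueError("not an XSD schema document")
    # When the attribute is absent, XSD treats it as 'unqualified'.
    return root.get("elementFormDefault", "unqualified")


# Sample schema roots: one without the attribute, one with it set explicitly.
without_attribute = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="3rdPartyNamespace"/>'
)
with_attribute = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="3rdPartyNamespace" elementFormDefault="qualified"/>'
)

print(element_form_default(without_attribute))
print(element_form_default(with_attribute))
```

Running something like this over every schema in a project would be one way to confirm they are all explicitly qualified.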

Another bit that confused me is that I thought that by adding the "xmlns" and "targetNamespace" attributes at the root of the schema I effectively tell the schema all child elements should belong to the same namespace, but this is not the case. The way to look at this, I guess, is that a schema is itself an xml document, so "xmlns" simply sets the default namespace for the elements of the schema document (which in my case is irrelevant as they are all prefixed by "xs:"), while "targetNamespace" only applies to the globally declared elements; locally declared ones follow elementFormDefault.
I don't know how the 3rd party guys created their schema, but it looks like they've missed the same point as I did and hopefully, assuming I'm right, they will consider changing it to be qualified; in the meantime - I thought it was worth sharing this point.


Thursday, August 02, 2007

Editing Xmls with intellisense in Visual Studio

Am I the last person in the world to realise that? hopefully not!...

...but I've just realised today that if you edit an xml file in Visual Studio and you have the schema of the xml you're editing open in another tab, it will recognise it (as soon as you type the root node and namespace) and will provide intellisense.

I knew you could tell it to use a schema, but I did not realise it will automatically include opened documents. brilliant feature!
