Updated 5/19/2005

MOSS & Collab Architecture

where the SDK rules

February 14

Index as Dedicated Front End for Crawls?

Hmm, Joelo raises an interesting point regarding the use of the Index server as a Dedicated WFE. Aside from the problem that Dedicated WFEs are not all that they should be, due to a silly implementation that depends on the hosts file [which has also been discussed in detail].. IMHO, if you want, you can manually configure a more robust solution, and even use multiple front end servers for dedicated crawling if necessary, just by using 2 hostnames, "crawlmeplease" vs "companyportal", and load balancing access accordingly.. but that's another story.

Generally, I look at the Network as something "available" and CPU as something that is "scarce"; however, I've been in situations where the opposite was the case [all traffic going through same switch on a busted network card.. i.e. 10 Mbit/sec...]. Thus, trying to limit "hops" is not really important in my opinion [there is no real impact on the SQL server, as the content has to be retrieved anyway]

Frankly, I think that the Index server is the weakest link in the V3 platform [MOSS 2007], and should be given as much elbow room as possible, especially in those large farms with a lot of difficult content to index.. and I think that a dedicated WFE should not be the index server, at least based on some of the following criteria:

  • Formula: check the difference in different activities that your server does, from crawling, to indexing, to propagation, to general server stats [CPU, memory, Disk I/O, network], on both servers in the mix
  • I bet you $2 that crawling [even a full crawl] will take about 1/10 of the time, and the rest of it will be content retrieval and indexing; propagation of the index will be fast as well.. if you have a ton of content, maybe separation is advisable.. it is not really a network problem but a CPU problem
  • Depending on the crazy content stored in your site [e.g. PDF files] the indexing part may take for-ever-and-ever, especially if you have a lot of lame IFilters that are single threaded and crash in the middle of a document.. or can't address memory or something else... verdict: separate index from WFE
  • Perhaps another reason for slowness [this time it is the WFE, not the index server] is not an index server but poorly written ASPX pages and Web Parts that someone insists on indexing, and perhaps the index server could deal with 20 of them at a time, but the WFE can only serve 5 simultaneously without spiking the CPU.. verdict: separate WFE from index
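
One way to run the comparison from the list above is to write down how long each activity takes and look at its share of the total. A minimal sketch in Python, assuming you collected the per-phase durations yourself [the function names and the numbers in the usage part are made up for illustration, they are not measurements]:

```python
def phase_shares(durations_min):
    """Return each phase's share of the total run time as a fraction of 1."""
    total = sum(durations_min.values())
    if total <= 0:
        raise ValueError("total duration must be positive")
    return {phase: minutes / total for phase, minutes in durations_min.items()}


def slowest_phase(durations_min):
    """Name of the phase that consumes the most time."""
    return max(durations_min, key=durations_min.get)


if __name__ == "__main__":
    # Hypothetical durations for one full crawl, in minutes.
    sample = {
        "crawling": 30,
        "content retrieval": 90,
        "indexing": 170,
        "propagation": 10,
    }
    for phase, share in phase_shares(sample).items():
        print(f"{phase:20s} {share:6.1%}")
    print("biggest consumer:", slowest_phase(sample))
```

Run it once with numbers from the index server and once with numbers from the WFE; if indexing dominates on one box and page serving on the other, that is the argument for keeping them apart.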

But.. don't trust my observations, and please do these measurements yourself. It may be that your Index server has enough capacity, and separation [or even a dedicated crawl WFE] is not necessary at all. Meanwhile, if you have a chance, encourage MS to invest some money in the ability to distribute:

  • crawling
  • indexing
  • index merging

Into some kind of a "computing cluster". That would be neat.

January 23

Boatload of resources released

Finally the long awaited documentation and samples have been released. I was expecting them to be posted at least a week back, but they are welcome anyway.

Plus I wanted to congratulate the MSDN and Technet teams on doing a very good job in providing excellent documentation. It seems that the current MOSS/WSS blogosphere is still playing catch-up with the documentation, which IMHO means that they did a great job. My favorite starting point [when dealing with a task I haven't done before] is the "How Do I..." section. It typically contains a wealth of conceptual information along with a code sample to get a particular task done. Also, from past experience, when you send some feedback it typically gets implemented [although after some cycles].

My favorite developer/architect resources:

SharePoint Server 2007 SDK: Software Development Kit and Enterprise Content Management Starter Kit

MOSS SDK provides a ton of useful starting points for workflow and records management, a couple of White Papers, as well as all of the stuff from WSS SDK [see links for 3 fixes that are required for complete installation with VS procedures] and way more. NOTE: online version of MOSS SDK is here

Windows SharePoint Services 3.0: Software Development Kit (SDK)

Please read the detailed installation for WSS [the workflow project types won't show up if you don't follow, and other stuff]. NOTE: online version of WSS SDK is here

WSS SDK contains help on:

  • Web Part Framework
  • Server-side object model
  • Web Services
  • CAML
  • Master Pages
  • Workflows [and workflow templates for VS 2005]
  • Custom Field Types
  • Information Rights Management
  • Document property info
  • Search

My favorite IT Pro/architect resources:

...well... there is just Technet and many many blogs.

Planning and architecture for Office SharePoint Server 2007

Deep inside this guide there is a very cool toolkit for load testing [among other things.. or pre-loading]. This toolkit needs a special shout-out. If you are developing on SharePoint [and are not necessarily an admin] you can use it to populate portals, sites with list and document data, test it, and then delete. Great utility if you need to do some heavy unit and regression testing. Read about the Tools for performance and capacity planning here.

Microsoft Office SharePoint Server TechCenter

In the past the "admin" help was downloadable.. but now it is not. I hope someone compiles the Technet Planning guides into a comprehensive admin help file.

Instead it is packed into some weird books..
Planning and architecture for Office SharePoint Server 2007

Deployment for Office SharePoint Server 2007

Planning and architecture, deployment, and operations for Office SharePoint Server 2007 for Search

SharePoint vs. File Shares

Each time I venture to do some mid-size and enterprise consulting, there is always the question of using SharePoint as a replacement for File Shares, or the typical Z: or G: drive mapped in an enterprise. Now, joelo has added some more discussion in his File Servers and SharePoint Doc Libraries entry. I just wanted to add some more caveats where SharePoint simply may not work as well, and places/reasons where I recommend SharePoint 200%.

Potential Problem Areas [not addressed before]

  1. Long Paths - WebDav has a limit of 260 chars, and so does MOSS/WSS [or less]. I can't seem to find any references for 2007 on Technet so here is an old one, but you can run a quick test... create "Really Long Folder Name 123456" 8 times and an error will occur. Such a long pathname is really hard to find, but it is still possible [although many other systems will have the same issue]. Recommendation: check paths/lengths before migrating.
  2. Weird files with inter-file references - the most common problem here includes Excel spreadsheets that are typically peppered with references to data stored in other Excel files, or help-style references to other documents. This will be difficult to migrate into SharePoint [or another system], as the references typically use an absolute file handle. If you migrate these into SharePoint.. expect to make a lot of modifications, and make sure you add proper clean-up time in your project if you discover these. Recommendation: you will have to clean up no matter what solution you choose, this is a brittle design anyway.
  3. Weird file names - Infrequently users will have filenames that are like test...txt or test#%.txt as a requirement of some application, or as part of a file naming standard. All these will be a no-no [there is an official list of disallowed characters in the SPS 2003 admin guide, if I recall properly]. Also, folder and file names must be 128 characters or less. Recommendation: check file names, and figure out a decent way of substituting these characters without making an embarrassing mistake.
  4. Weird applications that do some crazy file locking a la Access - obviously Access is just one of the applications that depends on the file system beyond just storage [I guess others would include some engineering documents, or streaming audio]. These will simply not benefit from SharePoint because they are specifically dependent on the OS file system to perform [e.g. access creates lock files, streaming audio works better when only chunks of file are sent]. For these types of files, I'd recommend using a specific application server, or retain a standard file share.
  5. Weird applications that use the files and create files - occasionally, the files that are created by users are also reused by some other enterprise applications. Make sure and test that the 3rd party application can actually work with SharePoint as part of your project plan.
  6. Security - SharePoint security is simply different from what you're used to in the file system. You can't really have a "deny", and some other options that are typically ignored are missing as well. If you happen to utilize one of those security options, you must rethink. Recommendation: check if security can be re-factored or updated to fit SharePoint, otherwise keep the file share.
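
The path and file name checks from items 1 and 3 are easy to script before a migration. Below is a rough pre-flight sketch in Python. The limits [260 and 128] come from the list above; the set of disallowed characters and the period rules are my assumption from memory of the WSS naming rules, so verify them against your own farm. All function names are hypothetical, and the length check does not account for URL encoding of spaces.

```python
import os

# Assumed limits and characters: the numbers come from the caveats above,
# the character set is from memory and must be verified before relying on it.
MAX_URL_LENGTH = 260
MAX_NAME_LENGTH = 128
BAD_CHARS = set('~"#%&*:<>?/\\{|}')


def check_name(name):
    """Return a list of problems with a single file or folder name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("name longer than %d characters" % MAX_NAME_LENGTH)
    found = sorted(set(name) & BAD_CHARS)
    if found:
        problems.append("disallowed characters: " + " ".join(found))
    if ".." in name:
        problems.append("consecutive periods")
    if name.startswith(".") or name.endswith("."):
        problems.append("leading or trailing period")
    return problems


def check_path(relative_path, url_prefix=""):
    """Check every segment of a share-relative path plus the total URL length."""
    segments = [s for s in relative_path.replace("\\", "/").split("/") if s]
    problems = []
    for segment in segments:
        problems += ["%s: %s" % (segment, p) for p in check_name(segment)]
    url = url_prefix.rstrip("/") + "/" + "/".join(segments)
    if len(url) > MAX_URL_LENGTH:
        problems.append("full URL is %d characters" % len(url))
    return problems


def scan_share(root, url_prefix=""):
    """Walk a file share and yield (path, problems) for anything suspicious."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(folder, name), root)
            problems = check_path(rel, url_prefix)
            if problems:
                yield rel, problems
```

Point `scan_share` at the mapped drive and pass the target document library URL as `url_prefix`, so the length check counts the full URL and not just the part below the share root.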

My top 5 super-duper benefits of using SharePoint are below. Just one note of advice: when planning on using these features, plan on training your users on how to leverage them. It should not take more than a 1-2 hour session [in the area of file usage].

  1. Versioning, publishing, workflow - especially useful when working in a group of people. Every place that uses a standard file system runs into file editing collisions [multiple simultaneous edits], or ends up with silly file names ["proposal version 1 tom.doc", "proposal Mary v2 -review.doc"]. This just leads to confusion and long term disaster. Recommendation: use SharePoint [MOSS if workflow is needed], deploy Versioning [and some pruning]. Savings: 2-3 hours a week/person in reconciliation, approval.
  2. Backup/Recycle bin - a typical problem with a standard file share, or even a local My Documents folder, is accidental deletion of files. It takes 3 days for someone to realize a file was deleted, and then a week to get it back; and when you save files locally, most likely there is no backup at all. With SharePoint, you get it back immediately [or fairly quickly, even if you deleted it from the recycle bin and have to ask an admin]. Recommendation: use OOB SharePoint [WSS works]. Savings: enormous if accidental deletion occurs.
  3. Access - when storing files in a share, you can typically only access it from few computers, then you have to do some crazy tricks to copy the files to open them at home, or even a different office. With SharePoint, files are available from desktop, intranet, or even the extranet if your company provides this feature. Recommendation: OOB SharePoint [WSS works]+ extranet. Savings: 1-2 hours a week/person in synchronization.
  4. Metadata - this is not necessarily obvious to many people that do not deal with metadata, but there is always a ton of implicit metadata to be found, even in a straight file system. First you start off with a folder/directory structure [people can store proposals, separate from invoices, separate from documentation, etc] and moving onto file names ["proposal for XYZ LA branch for 2006.doc"]. Eventually, there are 300 folders with 1 document in each folder, not an optimal solution for finding a document. It is so much easier to deal with the documents when you can see a collection of 10-200 documents and use metadata columns to browse/search the documents instead. Document folders could be used as security or team containers. And when the number of documents is too large, create more folders. Recommendation: rethink your organization of documents, use metadata [including site columns, preferably], roll out WSS. Savings: better reuse of data, impacts search. Cons: this, if decided by committee, may take a long time to implement.
  5. Search - for those people that have ever tried to find a file on a file share, you know it's like looking for a needle in a haystack. Whether it is 100 MB or 20 GB, it is almost always a very lengthy search [and typically locks up just about every resource available]. With SharePoint it is instantaneous, filtered, ranked and includes the full text of the documents. Recommendation: use SharePoint [MOSS]. Savings: 1-3 hours a week per knowledge worker.

Now, the final question/problem I would have: shell integration [also, why do Windows XP and Vista in general shy away from the Tree display?]. Was the ball dropped due to the bundling problem? SharePoint 2001 had a much better integration story with the desktop [property inspection, check-in/check-out, etc.]; now it is gone....

Whenever considering SharePoint as file storage, the benefits are clearly enormous, with straight out-of-the box implementations, but keep in mind that there are these few scenarios where you may run into some difficulties, and check for these before ruining your reputation. Make sure your knowledge workers are trained to reap the benefits which could be 5+ hours a week.

November 13

Will IT Departments be ready for Office Server 2007?

Wow. Has anyone explored the depths of Office 2007 administration?

As much as people complained about the administration of the SPS 2003 server [which was quite justified], there will be different types of complaints for the Office 2007 administration.

Whereas in the past [or currently] the admin pages were thrown around all over the place, and some pages were hard to navigate to, the current set of admin pages is laid out much, much better. Unfortunately, the new Office 2007 product has about 10x more features packed in. As I have been observing many skilled people get lost in simple security assignment, I believe it may be a hard transition for some people. Luckily, for those who are independently wealthy, or have wealthy companies, there is Admin 2007 Training available from the Mindsharp guys. I've met them a couple of times at MS events in the past, and they were all quite talented. As I haven't seen a lot of other Admin training [including from Microsoft], I recommend that you, or someone you know, attend. One of my colleagues will be attending the December class in DC [sold out]; I'll update with his experience.

Well, speaking of administration of SharePoint 2007. There is a quick and simple rule to follow. Learn the "logical" hierarchy:

  • Farm
    • Application
      • Site Collection
        • Site

After that, you'll need to figure out what features are available at each level, and voilà, you are an admin. You'll also have to learn about special things like Shared Services and parts of the application services. Shared Services are kind of a SOA implementation that is consumed by other applications in the farm.

Most people forget about this hierarchy when setting the admin security. Generally, you have to figure out what part you are currently trying to administer. People forget that just because you are an app admin, it does not mean you can manage a site collection. You can "add yourself" to the site collection admin [role?], but you can't manage the site before that happens [and if you do add yourself to the site collection admin, the operation will get logged in the system log!]. I'll do another entry on planning infrastructure and administration operations for a MOSS 2007 farm.
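
To make the point about the hierarchy concrete, here is a toy model in Python. It is only an illustration of the rule described above [rights are checked at the level you are administering and do not flow down from the application, and "adding yourself" leaves a trace]; it does not use or mimic the SharePoint object model, and every class and function name in it is made up.

```python
class Node:
    """One level of the logical hierarchy: farm, application, site collection or site."""

    def __init__(self, kind, name, parent=None):
        self.kind = kind
        self.name = name
        self.parent = parent
        self.children = []
        self.admins = set()

    def add(self, kind, name):
        """Create a child node one level down and return it."""
        child = Node(kind, name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Names from the farm down to this node."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " > ".join(reversed(names))


def can_administer(user, node):
    """Admin rights are checked at the node itself, not inherited from above."""
    return user in node.admins


def add_self_as_collection_admin(user, collection, audit_log):
    """An application admin may add themselves, and the operation is logged."""
    application = collection.parent
    if user not in application.admins:
        raise PermissionError(user + " is not an admin of " + application.name)
    collection.admins.add(user)
    audit_log.append(user + " added to admins of " + collection.path())


if __name__ == "__main__":
    farm = Node("farm", "Farm")
    app = farm.add("application", "companyportal")
    collection = app.add("site collection", "/sites/team")
    app.admins.add("appadmin")

    log = []
    print(can_administer("appadmin", collection))  # prints False
    add_self_as_collection_admin("appadmin", collection, log)
    print(can_administer("appadmin", collection))  # prints True
    print(log)
```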

Ah, I forgot about the infrastructure part. The infrastructure is a bit easier to remember, as it is probably infinitely more flexible than in SPS 2003. Just keep the front end servers separate from the app servers, and you're on your way to achieving a good balance. Next, just monitor the performance, and some other counters, and add servers where they are most needed.

Next post should be on some development.. how about the admin APIs?

Say No to /3GB in boot.ini

Some time ago, we decided to optimize server settings, as recommended by MVPs and other industry gurus. Basically, if it was in a Power Point slide deck, or in someone's blog, we tried it out. Unfortunately, we didn't quite see The Old New Thing blog entry which set things straight.

Unfortunately, the "/3GB" switch caused more grief than we anticipated... What happened [even MS Premier seemed to be dumbfounded for a month] is that after some time we'd get crazy calls from people complaining that they could not open "large office documents", with an "Internet Explorer cannot download ... from ..." error. Sometimes it would be a 1 MB file, sometimes it would be 512 KB.. we'd scratch our heads, observe that it only happened on 1 of 2 front-end servers, reboot the machine [after office hours] and try to provide access via alternative means. All for naught: any browser, any protocol, any machine [remote or localhost], same error [except it all worked with the other FE server!]. We called for support, nothing happened. We split a few portals off onto another set of servers [new farm], and our problems went away... but only for a while. The traffic picked up, and once again, the problems started appearing.

Luckily, with proper documentation of all symptoms, settings and the hardware, and probably a different MS Support engineer, when we called again, the problem was solved in less than 30 minutes [including wait time]. Bottom line is.. as soon as the memory consumption on the system would exceed 3 GB, the problems would start. Removing the /3GB switch, which we carefully placed there 6 months earlier, was quite embarrassing.
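
If you want to check whether a server still carries the switch, a few lines of Python will do. This is a minimal sketch that scans the text of a boot.ini for entries with /3GB; the function name is hypothetical and the sample file content is invented for illustration, it is not the one from our servers.

```python
def entries_with_3gb(boot_ini_text):
    """Return the [operating systems] entries that mention the /3GB switch."""
    hits = []
    in_os_section = False
    for raw in boot_ini_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Section header: only [operating systems] holds boot entries.
            in_os_section = line.lower() == "[operating systems]"
            continue
        if in_os_section and "/3gb" in line.lower():
            hits.append(line)
    return hits


# Invented sample content, for illustration only.
SAMPLE = r"""[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Standard" /fastdetect /3GB
"""

if __name__ == "__main__":
    for entry in entries_with_3gb(SAMPLE):
        print("found /3GB in:", entry)
```

The check is a plain substring match, so it would also flag an entry whose description text happens to contain "/3GB"; for a warning-only script that seemed like an acceptable trade-off.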
