Exporting documents from SharePoint using RoboCopy / RichCopy utility

I was recently asked whether there was a way to batch copy files out of SharePoint and log success or failure. There are many tools that import files into SharePoint, but few that do the opposite. After some thought I looked into using RoboCopy and found a great GUI version called RichCopy, which can be downloaded from http://technet.microsoft.com/en-us/magazine/2009.04.utilityspotlight.aspx?pr=blog.

To use this utility with SharePoint you will need to map the document libraries individually to a network drive, for example:

NET USE X: http://sharepoint/documents

Once the document library has been mapped, you can specify the X: drive as the source in RichCopy and choose a local directory to copy or move the files to.

image

Click Options and select the settings that apply to you.

image

You will notice that the export also includes the Forms folder containing AllItems.aspx etc. You can stop this folder being copied to the destination by enabling the advanced options and filtering it out; the advanced options also include the logging configuration.
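If you would rather script the same export than click through RichCopy, the idea can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: the library is already mapped to a drive as above, the function name export_library and the local folder C:\Export are made up for illustration, and the Forms folder is skipped by name. It is not how RichCopy itself works.

```python
import logging
import shutil
from pathlib import Path


def export_library(source, destination, skip_dirs=("Forms",)):
    """Copy every file under source to destination, logging each result.

    Returns two lists of source paths: (copied, failed).
    """
    source = Path(source)
    destination = Path(destination)
    skip = {name.lower() for name in skip_dirs}
    copied, failed = [], []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(source)
        # Ignore anything that sits inside an excluded folder such as Forms.
        if any(part.lower() in skip for part in relative.parts[:-1]):
            continue
        target = destination / relative
        try:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
        except OSError as exc:
            logging.error("FAILED %s: %s", path, exc)
            failed.append(path)
        else:
            logging.info("COPIED %s -> %s", path, target)
            copied.append(path)
    return copied, failed


# Example call (X: is the mapped library, C:\Export is an assumed local folder):
# logging.basicConfig(filename="export.log", level=logging.INFO)
# copied, failed = export_library("X:/", "C:/Export")
```

Each file gets one log line, so the log answers the success / failure question directly. As with RichCopy, only the files come across, not the SharePoint metadata.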

Please note that this solution does not copy SharePoint metadata. If this is a requirement, look at the third-party offerings available to achieve this.

Blocking Access to SharePoint Web Services in an Extranet / External Publishing Scenario

This article discusses a method of blocking access to SharePoint web services from external connections.

To do this you will need a publishing server such as Microsoft ISA Server / Forefront TMG or a 3rd party application. If you haven’t planned for such a server, I would strongly advise revising your design to include one (preferably two, for clustering, HA etc.).

The assumptions at this stage are that SharePoint is installed and ready to be published, and that you have already created a publishing rule on ISA for the SharePoint web application you want to publish, which is correctly configured and publishing SharePoint successfully.

The next step is to create a new standard web publishing rule (not a SharePoint rule) and place it ABOVE the SharePoint publishing rule for the main site. Remember that ISA evaluates rules in order.

So basically at this point what we want to do is block access to the SharePoint /_vti_bin folder.

Call the publishing rule something like ‘Extranet Web Service Block Rule’ and use the same web listener that you published SharePoint with.

image

Select the Paths tab and remove any entries. Then add a new path as follows:

image

What this will do is redirect anyone trying to access the _vti_bin folder to the accessdenied.aspx page, blocking them from connecting to the web service .asmx files.

It is more than likely that the rule will need tweaking for authentication to work correctly, so be prepared to spend some time testing this to get it right.
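To take some of the pain out of that testing, a small script run from outside the network can request a handful of web service URLs and report whether each one still answers. The following is a rough sketch only: the endpoint list is a small sample of the standard .asmx files, the base URL is a placeholder for your published site, and the "blocked" check is a simple heuristic (the request did not end up at the .asmx file with a 200 or a 401 authentication challenge).

```python
import urllib.error
import urllib.request

# A sample of SharePoint web services that live under /_vti_bin.
ENDPOINTS = ["lists.asmx", "webs.asmx", "usergroup.asmx", "search.asmx"]


def fetch_status(url):
    """Return (HTTP status, final URL) for a GET request, following redirects."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status, response.geturl()
    except urllib.error.HTTPError as exc:
        return exc.code, exc.url


def check_blocked(base_url, fetch=fetch_status):
    """Map each endpoint name to True if the request never reached the service."""
    results = {}
    for name in ENDPOINTS:
        url = base_url.rstrip("/") + "/_vti_bin/" + name
        status, final_url = fetch(url)
        # Heuristic: a 200 or a 401 challenge from the .asmx URL itself means
        # the request got through; a redirect or any other status counts as blocked.
        reached = status in (200, 401) and final_url.lower().endswith(name)
        results[name] = not reached
    return results


# Example call (the host name is a placeholder for your external URL):
# for name, blocked in check_blocked("https://extranet.example.com").items():
#     print(name, "blocked" if blocked else "STILL REACHABLE")
```

Run it once before adding the block rule and once after; anything still reported as reachable afterwards suggests the rule order or path needs another look.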

An excellent post I would recommend on SharePoint extranet best practices and lockdown is Joel Oleson’s post here.

This post is in no way a step-by-step guide to publishing SharePoint via TMG / ISA; it is simply a guide to blocking web service access.

Blocking the web services will certainly affect the functionality and usage of SharePoint externally, so such a change should be fully tested to make sure end users do not lose functionality they need. Further tweaking for specific web services can be achieved using path mapping.

Auto Approving Documents in SharePoint Document Library

Ok, so this sounds kinda strange, right? Why would you want to auto-approve documents in a library that has approval switched on?

Well, my scenario is this: I have project documents, some of which need approving and some that don’t. I hear you say, why not just have 2 document libraries for this? Well, I could, but I have set up some views to group the documents by content types / columns. To avoid customising SharePoint by creating a content query web part with custom XSLT, I looked into using a SharePoint Designer workflow to accomplish this.

The other thing to remember here is that I have set up the document library so that readers only see major versions of documents. I also have ‘Require Check Out’ enabled, so documents must be checked out before they can be edited. See the screenshot below for the advanced settings for the library:

image

So I have identified specific content types that do not require approval, and I need to auto-approve these documents. My challenge here is that once the workflow changes the document to ‘Approved’, SharePoint may see that the document has been modified and change it back to ‘Draft’. The document also needs to be checked out to make this change, so this could cause a problem.

The step I created to attempt to update the approval status is shown below.

image

Here was the first attempt at running the workflow, which failed because the document was not checked out:

image

So adding check out and check in around the update of the approval status is the next logical step.

image

And this was the error I got – unknown error – very helpful!

image

So I concluded that the only way to do this was to set ‘Require Check Out’ to ‘No’ under version settings for the document library.

image

And…. It worked - success!

So to conclude, you can’t auto-approve certain content types with a SharePoint Designer workflow in a document library that has ‘Require Check Out’ enabled. From a business perspective, disabling this option could result in multiple people editing the same document, so this needs to be weighed against the inconvenience of documents remaining in draft when they don’t need approval.

Using Google Analytics with Internal (Intranet) SharePoint Sites

This article applies to both MOSS and WSS and is very easy to implement.

You may ask how Google knows about an internal user accessing a SharePoint site. Well, it’s simple: the Google script runs client side every time a user accesses the site, so it is the client session that registers the hit with Google Analytics (GA). This does mean the user’s browser needs to be able to reach Google.

To implement Google Analytics successfully you will first need a GA account. From this account, enter the SharePoint URL, e.g. http://intranet (don’t worry about Google resolving this to your SharePoint site because obviously it won’t!), follow the instructions, and you should generate some code similar to the below:

<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-XXXXXXXXX-X");
pageTracker._trackPageview();
} catch(err) {}</script>

Copy the code and place it in the active master page for your SharePoint site, inside the <HEAD> section rather than just above the </BODY> tag as suggested in the Google walkthrough.

Hit the pages a few times and then keep checking your GA account for stats. It will take up to 24 hours for the stats to update. When you first implement the code, the GA site will show a yellow warning triangle suggesting the analytics code hasn’t been installed; after a couple of hours this should change to confirm it has.

Wait 24 hours and you should then see some results come through and you’re done!

Credit to Mike Knowles’ blog for his excellent guide on this.

Connecting SharePoint RSS Viewer web part to a SharePoint List RSS Feed

If you’ve ever tried to connect an RSS Viewer web part to a SharePoint list on the same web app / site collection, you may have received the following error when you entered the feed:

The RSS webpart does not support authenticated feeds.

From a quick browse of the web I found very little other than a post on Mark Arend’s blog. It talks about the need for Kerberos authentication if the list supplying the RSS feed has to require authentication rather than allow anonymous access, which may or may not be an option depending on how SharePoint is used.

Kerberos is somewhat challenging to set up, and if you do need authenticated feeds there are some great Kerberos overviews for SharePoint by a colleague of mine, Mike Cox, which you can check out here. Just so you know, enabling Kerberos in SharePoint isn’t simply a case of selecting the option under Authentication Providers in SharePoint Central Admin.