From DasBlog To SubText
My site was down because of DasBlog today. I have no idea why it started to misbehave again, and I don't really care at this point, that was the final reason that I needed to move to SubText. Here is how I did it:
- Put App_Offline.htm page in the blog directory, since otherwise DasBlog would kill the server
- Back up the existing blog.
- Prepare a SubText skin matching my current blog look & feel.
- I then took the DasBlog to BlogML importer here, which is still not ready, and started to play with the code. This is definately something that would require a developer to do, by the way.
I had to give up on the trackbacks because they issues an IsInRole() call and I am not connected to the office at the moment. I took the short way our and sprinkled Thread.CurrentPrincipal = new GenericPrincipal(..) all over the place (seem to reset itself for some reason), and it worked. - I then tried imprted the resulting file to SubText. I have a large blog, and this meant that I had to increase the timeout values. I then discovered that my posts are not legal in SubText, it was warning me about issues with <script> tags in the posts and then failed with some whitespace error that took some time to check in depth. Appernatly I had literal "<<" somewhere, which SgmlReader interpreted as whitespace, and that caused an error when it tried to save that.
- The posts are appearing in reverse order (the oldest are on the top), that is a side affect of the way BlogML worked, by output it to the export file in reverse order, and SubText accepted it in order. The fix:
UPDATE subtext_content
SET DateSyndicated = DateAdded
- The next issues, categories, I am pretty strict about categories, and it looked like the move messed them up, I got mutliply categories with the same name, probably from multiply attempts to import the same set of data, I cleaned it up with:
DELETE FROM subtext_linkcategories
WHERE categoryid NOT IN ( SELECT categoryid
FROM subtext_links )
SELECT LTRIM(RTRIM(title)),
MIN(categoryid),
MAX(categoryid),
COUNT(*)
FROM subtext_linkcategories
GROUP BY LTRIM(RTRIM(title))
HAVING COUNT(*) > 1
UPDATE subtext_links
SET categoryid = ( CASE categoryid
WHEN 452 THEN 460
WHEN 464 THEN 527
WHEN 465 THEN 504
WHEN 477 THEN 508
ELSE categoryid
END )
If you haven't figured it out yet, I really like the ability to run queries on my blog :-) - The next target was the comments, I first run this:
update subtext_feedback
set isBlogAuthor = 1
where author = 'Ayende Rahien'
UPDATE subtext_config
SET commentcount = ( SELECT COUNT(*)
FROM subtext_feedback
WHERE feedbacktype = 1
),
PingTrackCount = ( SELECT COUNT(*)
FROM subtext_feedback
WHERE feedbacktype = 2
)
- Now, the issues of preserving the old URLs. That one is very important to me, since I have a lot of links directed to my site that I don't want to break. I first had to run a check to see how many duplicate entries name I had, and realized that I had merely 12 duplicate entries, this is something that I can live with. I should explain that the importing process didn't give me a way to correlate between the DasBlog entry and the SubText entry. What I did have was the entry title and date, which should be enough, since I didn't post entries with duplicated names on the same day. I created the following table:
CREATE TABLE subtext_MappingFromDasBlog
(
Id int identity(1, 1)
primary key
not null,
subtextId int references dbo.subtext_content ( id )
not null,
date datetime not null,
dasBlogId uniqueidentifier not null,
dasblogUrl nvarchar(255) not null
)
import System
import System.Xml
import System.IO
import System.Data
import System.Data.SqlClient
con = SqlConnection("Data Source=localhost;Initial Catalog=SubtextData;Trusted_connection=yes;")
con.Open()
xdoc = XmlDocument()
xdoc.Load("C:\\export.xml")
nsMgr = XmlNamespaceManager(xdoc.NameTable)
nsMgr.AddNamespace("blog", "http://www.blogml.com/2006/09/BlogML")
for node as XmlNode in xdoc.SelectNodes("/blog:blog/blog:posts/blog:post", nsMgr):
id = Guid(node.SelectSingleNode("@id").Value)
url = node.SelectSingleNode("@post-url").Value;
title = node.SelectSingleNode("blog:title/text()", nsMgr).Value;
dateCreated = date.Parse(node.SelectSingleNode("@date-created").Value)
print "Id ${id} url ${url} title ${title} date ${dateCreated}"
using cmd = con.CreateCommand():
cmd.CommandText = """SELECT Id FROM subtext_Content
WHERE Title = @title and year(DateAdded) = @year
and month(DateAdded) = @month and day(DateAdded) = @day""";
cmd.Parameters.AddWithValue("title", title);
cmd.Parameters.AddWithValue("year", dateCreated.Year)
cmd.Parameters.AddWithValue("month", dateCreated.Month)
cmd.Parameters.AddWithValue("day", dateCreated.Day)
subtextId as int = cmd.ExecuteScalar()
cmd.Parameters.Clear()
cmd.CommandText = """INSERT INTO subtext_MappingFromDasBlog
VALUES(@subtextId, @date, @dasBlogId, @dasblogUrl);"""
cmd.Parameters.AddWithValue("subTextId", subtextId)
cmd.Parameters.AddWithValue("dasBlogId", id)
# just the date, not the time
cmd.Parameters.AddWithValue("date", dateCreated.Date)
#trim the http://www.ayende.com/Blog/yyyy-mm-dd/cmd.Parameters.AddWithValue("dasblogUrl", url.Substring(38) )
cmd.ExecuteNonQuery()
con.Dispose()
Now, to get some use out of it...
SubText supports UrlRewriting, with HttpHandlers, so the first thing I needed to do is enable going to a post just by its id. I had to add this to the HandlerConfiguration/httpHandlers section:
<HttpHandler pattern="(?:/archive/\d+\.aspx)$" controls="viewpost.ascx,Comments.ascx,PostComment.ascx"/>
Now for the more complex stuff, I wanted to be able to support the following formats.- CommentView,guid,{guid}.aspx
- PermaLink,guid,{guid}.aspx
- NormalizedPostTitle.aspx
- /yyyy/mm/dd/NormalizedPostTitle.aspx
It took a while, but I manage to integrate that into SubText cleanly (an additional handler and changing the web.config), so the old urls should still work. The issue was mainly
Important: If you notice a borkne link please let me know about it.
- Syndication - RSS / Atom - I get a lot more readers on the site than on RSS, but it is obviously very important thing to keep. Nothing is as annoying as a broken link. In this case, dasBlog is using SyndicationService.asmx
- Binaries - I merely kept them in the same place (I am talking about images, attachments, etc.
The reason that I wrote all of that is that I would actaully remember what I did next time that I would have to upgrade / modify the blog.
The last thing that you need to know is how to configure the url mapping:
<HandlerConfiguration defaultPageLocation="DTP.aspx" type="Subtext.Framework.UrlManager.HandlerConfiguration, Subtext.Framework">
<HttpHandlers>
<HttpHandler pattern="login.aspx" handlerType="Page" pageLocation="Login.aspx"/>
<HttpHandler pattern="(?:/admin\/handlers\/BlogMLExport\.ashx)" type="Subtext.BlogML.BlogMLHttpHandler, Subtext.BlogML" handlerType="Direct"/>
<HttpHandler pattern="/Admin/" handlerType="Directory" directoryLocation="Admin"/>
<HttpHandler pattern="/Providers/" handlerType="Directory" directoryLocation="Providers"/>
<HttpHandler pattern="/rss\.aspx$" type="Subtext.Framework.Syndication.RssHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="/SyndicationService\.asmx/GetRss$" type="Subtext.Framework.Syndication.RssHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="/atom\.aspx$" type="Subtext.Framework.Syndication.AtomHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="/SyndicationService\.asmx/GetAtom$" type="Subtext.Framework.Syndication.AtomHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="/comments/commentRss/\d+\.aspx$" type="Subtext.Framework.Syndication.RssCommentHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="/aggbug/\d+\.aspx$" type="Subtext.Framework.Tracking.AggBugHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="/customcss\.aspx$" type="Subtext.Web.UI.Handlers.BlogSecondaryCssHandler, Subtext.Web" handlerType="Direct"/>
<HttpHandler pattern="(?:/category\/(\d|\w|\s)+\.aspx/rss/?)$" type="Subtext.Framework.Syndication.RssCategoryHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="(?:/ArchivePostPage.aspx)$" controls="ArchivePostPage.ascx"/>
<HttpHandler pattern="(?:/LinkPage.aspx)$" controls="LinkPage.ascx"/>
<HttpHandler pattern="(?:/ArticleCategories.aspx)$" controls="ArticleCategories.ascx"/>
<HttpHandler pattern="(?:/archives\.aspx)$" controls="SingleColumn.ascx"/>
<HttpHandler pattern="(?:/archive/\d{4}/\d{2}/\d{2}/\d+\.aspx)$" controls="viewpost.ascx,Comments.ascx,PostComment.ascx"/>
<HttpHandler pattern="(?:/archive/\d+\.aspx)$" controls="viewpost.ascx,Comments.ascx,PostComment.ascx"/>
<HttpHandler pattern="(?:/archive/\d{4}/\d{2}/\d{2}/[-_,+\.\w]+\.aspx)$" controls="viewpost.ascx,Comments.ascx,PostComment.ascx"/>
<HttpHandler pattern="(?:/archive/\d{4}/\d{1,2}/\d{1,2}\.aspx)$" controls="ArchiveDay.ascx"/>
<HttpHandler pattern="(?:/archive/\d{4}/\d{1,2}\.aspx)$" controls="ArchiveMonth.ascx"/>
<HttpHandler pattern="(?:/articles/\d+\.aspx)$" controls="viewpost.ascx,Comments.ascx,PostComment.ascx"/>
<HttpHandler pattern="(?:/articles/[-_,+\.\w]+\.aspx)$" controls="viewpost.ascx,Comments.ascx,PostComment.ascx"/>
<HttpHandler pattern="(?:/\d{4}/\d{2}/\d{2}/[\d\w]+\.aspx)$" type="UrlRewriting.For.DasBlog.FromDasBlogToSubText, UrlRewriting.For.DasBlog" handlerType="Factory" />
<HttpHandler pattern="(?:commentview,guid,[0-9a-fA-F]{8}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{12}\.aspx)$" type="UrlRewriting.For.DasBlog.FromDasBlogToSubText, UrlRewriting.For.DasBlog" handlerType="Factory" />
<HttpHandler pattern="(?:commentview\.aspx)$" type="UrlRewriting.For.DasBlog.FromDasBlogToSubText, UrlRewriting.For.DasBlog" handlerType="Factory" />
<HttpHandler pattern="(?:/contact\.aspx)$" controls="Contact.ascx"/>
<HttpHandler pattern="(?:/posts/|/story/|/archive/)" type="Subtext.Web.UI.Handlers.RedirectHandler, Subtext.Web" handlerType="Direct"/>
<HttpHandler pattern="(?:/gallery\/\d+\.aspx)$" controls="GalleryThumbNailViewer.ascx"/>
<HttpHandler pattern="(?:/gallery\/image\/\d+\.aspx)$" controls="ViewPicture.ascx"/>
<HttpHandler pattern="(?:/(?:category|stories)/(\w|\s)+\.aspx)$" controls="CategoryEntryList.ascx"/>
<HttpHandler pattern="(?:/comments\/\d+\.aspx)$" type="Subtext.Framework.Syndication.CommentHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="(?:/services\/trackbacks/\d+\.aspx)$" type="Subtext.Framework.Tracking.TrackBackHandler, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="(?:/services\/pingback\.aspx)$" type="Subtext.Framework.Tracking.PingBackService, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="(?:/services\/metablogapi\.aspx)$" type="Subtext.Framework.XmlRpc.MetaWeblog, Subtext.Framework" handlerType="Direct"/>
<HttpHandler pattern="(?:/default\.aspx)$" controls="homepage.ascx"/>
<HttpHandler pattern="(?:/[\d\w]+\.aspx)$" type="UrlRewriting.For.DasBlog.FromDasBlogToSubText, UrlRewriting.For.DasBlog" handlerType="Factory" />
<!-- this needs to be last -->
<HttpHandler pattern="(?:((\/\/default\.aspx)?|(\/\/?))?)$" controls="homepage.ascx"/>
</HttpHandlers>
</HandlerConfiguration>
And you can find the code for the actual URL mapping here.
It is simple, it works, and that is all I care about.
If you haven't guess, this migration was NOT a walk in the park. I hope that I got all the kinks out, but let me know if there is anything that I missed.
Comments
Comment preview