We are in the process of porting a legacy system to .NET, both to clean up architecture but also to take advantage of lots of new possibilities that just aren't easily done in the legacy system.
Note: When reading my post before submitting it I notice that I may have described things a bit too fast in places, ie. glossed over details. If there is anything that is unclear, leave a comment (not an answer) and I'll augment as much as possible
The legacy system uses a database, and 100% custom written SQL all over the place. This has lead to wide tables (ie. many columns), since code that needs data only retrieves what is necessary for the job.
As part of the port, we introduced an ORM layer which we can use, in addition to custom SQL. The ORM we chose is DevExpress XPO, and one of the features of this has also lead to some problems for us, namely that when we define a ORM class for, say, the Employee table, we have to add properties for all the columns, otherwise it won't retrieve them for us.
This also means that when we retrieve an Employee, we get all the columns, even if we only need a few.
One nice thing about having the ORM is that we can put some property-related logic into the same classes, without having to duplicate it all over the place. For instance, the simple expression to combine first, middle and last name into a "display name" can be put down there, as an example.
However, if we write SQL code somewhere, either in a DAL-like construct or, well, wherever, we need to duplicate this expression. This feels wrong and looks like a recipe for bugs and maintenance nightmare.
However, since we have two choices:
Then we came up with an alternative. Since the ORM objects are code-generated from a dictionary, we decided to generate a set of dumb classes as well. These will have the same number of properties, but won't be tied to the ORM in the same manner. Additionally we added interfaces for all of the objects, also generated, and made both the ORM and the dum objects implement this interface.
This allowed us to move some of this logic out into extension methods tied to the interface. Since the dumb objects carry enough information for us to plug them into our SQL-classes and instead of getting a DataTable back, we can get a List back, with logic available, this looks to be working.
However, this has lead to another issue. If I want to write a piece of code that only displays or processes employees in the context that I need to know who they are (ie. their identifier in the system), as well as their name (first, middle and last), if I use this dumb object, I have no guarantee by the compiler that the code that calls me is really providing all this stuff.
One solution is for us to make the object know which properties have been assigned values, and an attempt to read an unassigned property crashes with an exception. This gives us an opportunity at runtime to catch contract breaches where code is not passing along enough information.
This also looks clunky to us.
So basically what I want advice on is if anyone else has been in, or are in, this situation and any tips or advice you can give.
We can not, at the present time, break up the tables. The legacy application will still have to exist for a number of years due to the size of the port, and the .NET code is not a in-3-years-release type of project but will be phased in in releases along the way. As such, both the legacy system and the .NET code need to work with the same tables.
We are also aware that this is not an ideal solution so please refrain from advice like "you shouldn't have done it like this". We are well aware of this :)
One thing we've looked into is to create an XML file, or similar, with "contracts". So we could put into this XML file something like this:
This could allow us to code-generate those 8 classes (full class + 7 smaller variations), and have the generator detect that for variation #3, property X, Y and K is present, and I can then tie in either the code for the logic or the interfaces the logic needs into this class automagically. This would allow us to have a number of different types of employee classes, with varying degrees of property coverage, and have the generator automatically add all logic that would be supported by this class to it.
My code could then say that I need an employee of type IEmployeeWithAddressAndPhoneNumbers.
This too looks clunky.
I would suggest that eventually a database refactoring (normalization) is probably in order. You could work on the refactoring and use views to provide the legacy application with an interface to the database consistent with what it expects. That is, for example, break the employe table down in to employee_info, employee_contact_info, employee_assignments, and then provide the legacy application with a view named employee that does a join across these three tables (or maybe a table-based function if the logic is more complex). This would potentially allow you to move ahead with a fully ORM-based solution which is what I would prefer and keep your legacy application happy. I would not proceed with a mixed solution of ORM/direct SQL, although you might be able to augment your ORM by having some entity classes which provide different views of the same data (say a join across a couple of tables for read-only display).
我建議最終數據庫重構(規范化)可能是有序的。您可以使用重構和使用視圖來為遺留應用程序提供與數據庫接口一致的接口。也就是說,例如,將employees表分解為employee_info,employee_contact_info,employee_assignments,然后為遺留應用程序提供名為employee的視圖,該視圖在這三個表之間進行連接(如果邏輯是,則可能是基於表的函數)更復雜)。這可能會讓您繼續使用完全基於ORM的解決方案,這是我希望的並保持您的遺留應用程序的快樂。我不會繼續使用ORM / direct SQL的混合解決方案,盡管您可以通過提供一些實體類來擴充ORM,這些實體類提供相同數據的不同視圖(例如,跨幾個表的連接以進行只讀顯示)。
"We can not, at the present time, break up the tables. The legacy application will still have to exist for a number of years due to the size of the port, and the .NET code is not a in-3-years-release type of project but will be phased in in releases along the way. As such, both the legacy system and the .NET code need to work with the same tables."
“目前我們不能打破這些表格。由於端口的大小,遺留應用程序仍然需要存在多年,並且.NET代碼不是3年 - 發布類型的項目,但將在發布過程中分階段進行。因此,遺留系統和.NET代碼都需要使用相同的表。“
Two words: materialized views.
You have several ways of "normalizing in place".
Materialized Views, a/k/a indexed views. This is a normalized clone of your source tables.
物化視圖,a / k / a索引視圖。這是源表的規范化克隆。
Explicit copying from old tables to new tables. "Ick" you say. However, consider that you'll be incrementally removing functionality from the old app. That means that you'll have some functionality in new, normalized tables, and the old tables can be gracefully ignored.
Explicit 2-way synch. This is hard, not not impossible. You normalize via copy from your legacy tables to correctly designed tables. You can -- as a temporary solution -- use Stored Procedures and Triggers to clone transactions into the legacy tables. You can then retire these kludges as your conversion proceeds.
明確的雙向同步。這很難,並非不是不可能。您可以通過復制從舊表到正確設計的表進行規范化。您可以 - 作為臨時解決方案 - 使用存儲過程和觸發器將事務克隆到舊表中。然后,您可以在轉換過程中淘汰這些kludges。
You'll be happiest to do this in two absolutely distinct schemas. Since the old database probably doesn't have a well-designed schema, your new database will have one or more named schema so that you can maintain some version control over the definitions.
Although I haven't used this particular ORM, views can be useful in some cases in providing lighter-weight objects for display and reporting in these types of databases. According to their documentation they do support such a concept: XPView Concepts