We are in the process of porting a legacy system to .NET, both to clean up architecture but also to take advantage of lots of new possibilities that just aren't easily done in the legacy system.
Note: When reading my post before submitting it I notice that I may have described things a bit too fast in places, ie. glossed over details. If there is anything that is unclear, leave a comment (not an answer) and I'll augment as much as possible
The legacy system uses a database, and 100% custom written SQL all over the place. This has lead to wide tables (ie. many columns), since code that needs data only retrieves what is necessary for the job.
As part of the port, we introduced an ORM layer which we can use, in addition to custom SQL. The ORM we chose is DevExpress XPO, and one of the features of this has also lead to some problems for us, namely that when we define a ORM class for, say, the Employee table, we have to add properties for all the columns, otherwise it won't retrieve them for us.
This also means that when we retrieve an Employee, we get all the columns, even if we only need a few.
One nice thing about having the ORM is that we can put some property-related logic into the same classes, without having to duplicate it all over the place. For instance, the simple expression to combine first, middle and last name into a "display name" can be put down there, as an example.
However, if we write SQL code somewhere, either in a DAL-like construct or, well, wherever, we need to duplicate this expression. This feels wrong and looks like a recipe for bugs and maintenance nightmare.
However, since we have two choices:
Then we came up with an alternative. Since the ORM objects are code-generated from a dictionary, we decided to generate a set of dumb classes as well. These will have the same number of properties, but won't be tied to the ORM in the same manner. Additionally we added interfaces for all of the objects, also generated, and made both the ORM and the dum objects implement this interface.
This allowed us to move some of this logic out into extension methods tied to the interface. Since the dumb objects carry enough information for us to plug them into our SQL-classes and instead of getting a DataTable back, we can get a List back, with logic available, this looks to be working.
However, this has lead to another issue. If I want to write a piece of code that only displays or processes employees in the context that I need to know who they are (ie. their identifier in the system), as well as their name (first, middle and last), if I use this dumb object, I have no guarantee by the compiler that the code that calls me is really providing all this stuff.
One solution is for us to make the object know which properties have been assigned values, and an attempt to read an unassigned property crashes with an exception. This gives us an opportunity at runtime to catch contract breaches where code is not passing along enough information.
This also looks clunky to us.
So basically what I want advice on is if anyone else has been in, or are in, this situation and any tips or advice you can give.
We can not, at the present time, break up the tables. The legacy application will still have to exist for a number of years due to the size of the port, and the .NET code is not a in-3-years-release type of project but will be phased in in releases along the way. As such, both the legacy system and the .NET code need to work with the same tables.
We are also aware that this is not an ideal solution so please refrain from advice like "you shouldn't have done it like this". We are well aware of this :)
One thing we've looked into is to create an XML file, or similar, with "contracts". So we could put into this XML file something like this:
This could allow us to code-generate those 8 classes (full class + 7 smaller variations), and have the generator detect that for variation #3, property X, Y and K is present, and I can then tie in either the code for the logic or the interfaces the logic needs into this class automagically. This would allow us to have a number of different types of employee classes, with varying degrees of property coverage, and have the generator automatically add all logic that would be supported by this class to it.
My code could then say that I need an employee of type IEmployeeWithAddressAndPhoneNumbers.
This too looks clunky.
I would suggest that eventually a database refactoring (normalization) is probably in order. You could work on the refactoring and use views to provide the legacy application with an interface to the database consistent with what it expects. That is, for example, break the employe table down in to employee_info, employee_contact_info, employee_assignments, and then provide the legacy application with a view named employee that does a join across these three tables (or maybe a table-based function if the logic is more complex). This would potentially allow you to move ahead with a fully ORM-based solution which is what I would prefer and keep your legacy application happy. I would not proceed with a mixed solution of ORM/direct SQL, although you might be able to augment your ORM by having some entity classes which provide different views of the same data (say a join across a couple of tables for read-only display).
我建议最终数据库重构(规范化)可能是有序的。您可以使用重构和使用视图来为遗留应用程序提供与数据库接口一致的接口。也就是说,例如,将employees表分解为employee_info,employee_contact_info,employee_assignments,然后为遗留应用程序提供名为employee的视图,该视图在这三个表之间进行连接(如果逻辑是,则可能是基于表的函数)更复杂)。这可能会让您继续使用完全基于ORM的解决方案,这是我希望的并保持您的遗留应用程序的快乐。我不会继续使用ORM / direct SQL的混合解决方案,尽管您可以通过提供一些实体类来扩充ORM,这些实体类提供相同数据的不同视图(例如,跨几个表的连接以进行只读显示)。
"We can not, at the present time, break up the tables. The legacy application will still have to exist for a number of years due to the size of the port, and the .NET code is not a in-3-years-release type of project but will be phased in in releases along the way. As such, both the legacy system and the .NET code need to work with the same tables."
“目前我们不能打破这些表格。由于端口的大小,遗留应用程序仍然需要存在多年,并且.NET代码不是3年 - 发布类型的项目,但将在发布过程中分阶段进行。因此,遗留系统和.NET代码都需要使用相同的表。“
Two words: materialized views.
You have several ways of "normalizing in place".
Materialized Views, a/k/a indexed views. This is a normalized clone of your source tables.
物化视图,a / k / a索引视图。这是源表的规范化克隆。
Explicit copying from old tables to new tables. "Ick" you say. However, consider that you'll be incrementally removing functionality from the old app. That means that you'll have some functionality in new, normalized tables, and the old tables can be gracefully ignored.
Explicit 2-way synch. This is hard, not not impossible. You normalize via copy from your legacy tables to correctly designed tables. You can -- as a temporary solution -- use Stored Procedures and Triggers to clone transactions into the legacy tables. You can then retire these kludges as your conversion proceeds.
明确的双向同步。这很难,并非不是不可能。您可以通过复制从旧表到正确设计的表进行规范化。您可以 - 作为临时解决方案 - 使用存储过程和触发器将事务克隆到旧表中。然后,您可以在转换过程中淘汰这些kludges。
You'll be happiest to do this in two absolutely distinct schemas. Since the old database probably doesn't have a well-designed schema, your new database will have one or more named schema so that you can maintain some version control over the definitions.
Although I haven't used this particular ORM, views can be useful in some cases in providing lighter-weight objects for display and reporting in these types of databases. According to their documentation they do support such a concept: XPView Concepts