(Relational) database performance for a date/time point/interval


So I am doing a project in Access SQL and it has come along nicely. I have learned a lot about Access and VBA and this site has been helpful in the process.

Now I am facing a performance problem, and since I have little experience with this kind of SQL work, I have come here for some thoughts.

I have a relational database of about 20 tables covering around 100 sections, which represent parts of a route. The Access database is essentially a map on which I drew several routes (via lines) that can be coloured dynamically: the colour is determined by the specific question and calculated from the database.

Here is a picture which explains it better. [Image: "a thing"] You cannot click on lines in Access, so the buttons are set to be identical in colour and width to the lines and are clickable for more information.

The user can choose a date and the map will display the progress of the route according to the question asked. Up to now, these questions were always binary "yes or no" (green or red).

I have found that because of the complexity of the queries I have to pretty much prepare a temporary database for each query at startup, otherwise it is not possible to scroll through dates smoothly.

So anyway here is my specific problem:

Each section of the route can be in a different phase (think construction) at a certain date, from "phase 0" to "done".

A new line is to be implemented which represents phases of a project. There are around 8 possible phases for all sections, which can happen at different times and - here is the thing - in a different order for each section AND not all phases happen on all sections.

What I have in the database are only starting dates, not ending dates, for each phase. The order of the phases pretty much has to be determined by the order of the starting dates. At least each phase can only happen once for each section, so there is that. As you can see, this is a shitty thing for this kind of performance-centric program.

I am certain it will involve one or several temporary databases. My ideas:

  1. Aggregate all dates into one row of a new table. Since the number of phases is fixed, there are columns for each phase: whether it is needed, when it starts and when it ends. A loop then has to go through the phases and check which one the user's date falls into. So: "SectionID - phase1needed phase1start phase1end ....."

    Advantages:
    • One can confirm the data manually and display it well in secondary forms
    • It keeps the database small

    Disadvantage:
    • The actual loop needs to go through (at worst) all phases to find the correct one.

  2. Calculate a new database which is just "IdSection - Date - Phase" and calculate a phase for each section and EVERY day in an interval.

    Advantages:
    • This keeps the runtime calculations to one query per section
    • Access should work with large amounts of data

    Disadvantages:
    • I cannot manually check if what I did was correct for all sections
    • It will take long at startup, like really long
    • It will take a lot of entries in that db

Which of these would you prefer, or is there a different method altogether? I cannot really change much about the data points I have.

In short I have to display intervals of time of different phases and in the database I only have starting points of time, no complete order of the phases.
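
To make the interval idea concrete, here is a minimal Python sketch (not Access code). It assumes that a phase ends when the next phase of the same section starts, which the question implies but does not confirm. The sample rows and the helper names `phase_intervals` and `phase_on` are made up for illustration.

```python
from datetime import date
from itertools import groupby

# Hypothetical rows: (section, phase, start_date). Only start dates are stored.
rows = [
    (1, 7, date(2012, 11, 7)),
    (1, 2, date(2012, 12, 14)),
    (1, 3, date(2012, 12, 28)),
    (2, 1, date(2012, 11, 4)),
    (2, 9, date(2012, 12, 30)),
]

def phase_intervals(rows):
    """Return (section, phase, start, end); end is None for the latest phase."""
    result = []
    # Sort by section, then by start date, so phases follow each other in time.
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for section, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        for i, (_, phase, start) in enumerate(group):
            # Assumption: a phase ends when the next one in this section starts.
            end = group[i + 1][2] if i + 1 < len(group) else None
            result.append((section, phase, start, end))
    return result

def phase_on(intervals, section, day):
    """Phase of `section` on `day`, or None if no phase has started yet."""
    for sec, phase, start, end in intervals:
        if sec == section and start <= day and (end is None or day < end):
            return phase
    return None

intervals = phase_intervals(rows)
print(phase_on(intervals, 1, date(2012, 12, 15)))
```

The derived table is roughly what option 1 would store, just with one row per phase instead of one wide row per section.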

Thank you for your thoughts. Any experience with this sort of thing will help.

1 solution

If I understand you properly, you have a series of data similar to the form:

Section 1, Phase 7, Start Date = 11/07/2012
Section 1, Phase 2, Start Date = 12/14/2012
Section 1, Phase 3, Start Date = 12/28/2012
Section 2, Phase 1, Start Date = 11/04/2012
Section 2, Phase 9, Start Date = 12/30/2012
Section 3, Phase 4, Start Date = 11/19/2012
Section 3, Phase 5, Start Date = 12/06/2012
Section 3, Phase 3, Start Date = 12/11/2012

and you want to answer a question like "What phase is each section in on 12/15/2012?", is that correct?

The answer in this case should look something like the form:

Section 1, Phase 2
Section 2, Phase 1
Section 3, Phase 3

In order to do this, I'll assume you have a table called SECTION_PHASES with the following fields:

SECTION    Number
PHASE      Number
START_DATE Date/Time

What you need to do is figure out the maximum start date for each section that happened before your current input date, because that is the most recently active phase before the next phase change. Once you do that, you can join that information back into your main table to determine what the phase was after that date.

You need to make one query, SECTION_MAX_DATES, with the following code in its SQL View:

SELECT [SECTION_PHASES].SECTION, Max([SECTION_PHASES].START_DATE) AS target_date
FROM SECTION_PHASES
WHERE [SECTION_PHASES].START_DATE<#12/15/2012#
GROUP BY [SECTION_PHASES].SECTION
ORDER BY [SECTION_PHASES].SECTION;

Once you have that query saved, you can join it as a subquery back to your original table. Now, make another query SECTION_PHASE_AT_DATE which includes your original table and the previous query, then enter the following code in its SQL View:

SELECT SECTION_PHASES.SECTION, SECTION_PHASES.PHASE, SECTION_PHASES.START_DATE
FROM SECTION_MAX_DATES INNER JOIN SECTION_PHASES ON (SECTION_MAX_DATES.target_date=SECTION_PHASES.START_DATE) AND (SECTION_MAX_DATES.SECTION=SECTION_PHASES.SECTION)
ORDER BY SECTION_PHASES.SECTION;

That query will give you the result you are after, if I understand your question correctly. There is no need to calculate the end dates, provided I understand you properly that a new start date in a section marks the end of whatever phase was current in that section up to then.
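
If you want to check the two queries against the sample rows above without touching your Access file, a rough sketch in Python with the built-in `sqlite3` module could look like this. SQLite is only a stand-in here, not Access: the dates are stored as ISO text instead of `#mm/dd/yyyy#` literals, and the saved query SECTION_MAX_DATES is mimicked with a view. Both are my assumptions for the sake of a runnable example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SECTION_PHASES (SECTION INTEGER, PHASE INTEGER, START_DATE TEXT)"
)
# Sample rows from above, with dates as ISO strings so they sort correctly.
conn.executemany(
    "INSERT INTO SECTION_PHASES VALUES (?, ?, ?)",
    [
        (1, 7, "2012-11-07"), (1, 2, "2012-12-14"), (1, 3, "2012-12-28"),
        (2, 1, "2012-11-04"), (2, 9, "2012-12-30"),
        (3, 4, "2012-11-19"), (3, 5, "2012-12-06"), (3, 3, "2012-12-11"),
    ],
)

# Stand-in for the saved query SECTION_MAX_DATES: latest start per section
# strictly before the input date.
conn.execute("""
    CREATE VIEW SECTION_MAX_DATES AS
    SELECT SECTION, MAX(START_DATE) AS target_date
    FROM SECTION_PHASES
    WHERE START_DATE < '2012-12-15'
    GROUP BY SECTION
""")

# Stand-in for SECTION_PHASE_AT_DATE: join back to pick up the phase.
phase_at_date = conn.execute("""
    SELECT p.SECTION, p.PHASE, p.START_DATE
    FROM SECTION_MAX_DATES AS m
    INNER JOIN SECTION_PHASES AS p
        ON m.target_date = p.START_DATE AND m.SECTION = p.SECTION
    ORDER BY p.SECTION
""").fetchall()

for section, phase, _start in phase_at_date:
    print(f"Section {section}, Phase {phase}")
```

With the sample rows this prints Section 1, Phase 2; Section 2, Phase 1; Section 3, Phase 3, which matches the expected answer listed earlier.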

You'll still have a few edge cases to work out, like what happens if a section doesn't have a phase registered yet prior to the given date. I'll also leave it to you to figure out how to parameterize the date in the WHERE clause of the 1st of the two queries, which is probably trivial for you given the progress you made already! However, I think this is the SQL structure you were looking for to solve the data/calculation part of your problem.
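
As a hedged illustration of those two loose ends, here is one possible shape, again in Python with `sqlite3` as a stand-in rather than real Access code. The function name `phases_on` is made up. The date is passed as a bound parameter (in Access you would more likely use a query parameter or a form control), and LEFT JOINs keep sections that have no phase yet, returning `None` for them. Note that this sketch uses `<=`, so a phase that starts exactly on the chosen date counts as active; the query above uses `<`, which would not count it. Which one you want depends on how your dates are stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SECTION_PHASES (SECTION INTEGER, PHASE INTEGER, START_DATE TEXT)"
)
conn.executemany(
    "INSERT INTO SECTION_PHASES VALUES (?, ?, ?)",
    [
        (1, 7, "2012-11-07"), (1, 2, "2012-12-14"), (1, 3, "2012-12-28"),
        (2, 1, "2012-11-04"), (2, 9, "2012-12-30"),
        (3, 4, "2012-11-19"), (3, 5, "2012-12-06"), (3, 3, "2012-12-11"),
    ],
)

def phases_on(conn, day):
    """Return [(section, phase_or_None)] for every section as of ISO date `day`."""
    return conn.execute(
        """
        SELECT s.SECTION, p.PHASE
        FROM (SELECT DISTINCT SECTION FROM SECTION_PHASES) AS s
        LEFT JOIN (
            SELECT SECTION, MAX(START_DATE) AS target_date
            FROM SECTION_PHASES
            WHERE START_DATE <= ?          -- date supplied at run time
            GROUP BY SECTION
        ) AS m ON m.SECTION = s.SECTION
        LEFT JOIN SECTION_PHASES AS p
            ON p.SECTION = m.SECTION AND p.START_DATE = m.target_date
        ORDER BY s.SECTION
        """,
        (day,),
    ).fetchall()

# On 2012-11-05 only section 2 has started a phase.
print(phases_on(conn, "2012-11-05"))
```

On the map, a `None` phase could simply be drawn in a neutral colour, but that part is up to you.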

