18 Replies Latest reply on Oct 24, 2011 6:30 PM by RJKSolutions

# Get the Archive - Files

Is it possible to get a list of the archive files?

What I want to do is create a job / task that reprocesses all the archives which are "ready for reprocessing".

=> How do I get the list of archives?

PS: I also created a feature request. Jobs like this should be provided by OSIsoft (#: 296881)

Regards
Wolfgang

• ###### Re: Get the Archive - Files

Hello!

We have implemented this, but in a very ugly way: we run PIARTOOL with the appropriate switches and parse the output of the utility. I asked Support, but unfortunately there is currently no other way to do this. PI SMT (Archive plug-in) uses some secret OSIsoft technology to get the archive list from PI...

• ###### Re: Get the Archive - Files

Thanks for the input..

• ###### Re: Get the Archive - Files

Hi Wolfgang,

Are you looking to get the list of archive files programmatically using PI SDK (since this post is located in this forum)? This information is not exposed by PI SDK at this moment.

Sergey Bannikov

We have implemented this, but in a very ugly way: we run PIARTOOL with the appropriate switches and parse the output of the utility. I asked Support, but unfortunately there is currently no other way to do this. PI SMT (Archive plug-in) uses some secret OSIsoft technology to get the archive list from PI...

I am not sure what the "secret" method is that PI SMT uses to extract that information, but another command you can look at is "pidiag -ad". This command dumps the information from the archive manager data file (sometimes referred to as the archive registry). The output from the dump looks something like this:
Archive manager data file version is 2
Archive primary archive code is 1
Archive manager data file dump follows:
PInt[$Workfile: pinttmpl.cxx$ $Revision: 14$]::
  Table contains 4 entries, with 4 total slots allocated.
  Extend size is 8 slots and the next iCode is 5.
  1. C:\Program Files\PI\dat\piarch.001
  2. C:\Program Files\PI\dat\piarch.002
  3. C:\Program Files\PI\dat\piarch.003
  4. C:\Program Files\PI\dat\piarchHist.001
  Alphabetical index:
  C:\Program Files\PI\dat\piarch.001
  C:\Program Files\PI\dat\piarch.002
  C:\Program Files\PI\dat\piarch.003
  C:\Program Files\PI\dat\piarchHist.001
End of Dump

There is still some unnecessary output, as you can see, but part of it gives the full file paths of the registered archive files as a list.
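For anyone who wants to script this, here is a minimal sketch of pulling the paths out of such a dump. It assumes the dump text has one entry per line, as the console prints it, and that the numbered table always comes before the "Alphabetical index" section. The function name `parse_archive_dump` is made up for illustration, and the sample text is simply the dump quoted above.

```python
import re

# Matches the numbered table entries of the dump, e.g. "  1. C:\...\piarch.001".
ENTRY = re.compile(r"^\s*(\d+)\.\s+(.+?)\s*$")


def parse_archive_dump(text):
    """Return the archive paths from the numbered table of a 'pidiag -ad' dump."""
    paths = []
    for line in text.splitlines():
        # Stop before the alphabetical index, which repeats the same paths.
        if line.strip().startswith("Alphabetical index"):
            break
        match = ENTRY.match(line)
        if match:
            paths.append(match.group(2))
    return paths


# Sample text copied from the dump quoted in this post.
SAMPLE = r"""Archive manager data file version is 2
Archive primary archive code is 1
Archive manager data file dump follows:
PInt[$Workfile: pinttmpl.cxx$ $Revision: 14$]::
  Table contains 4 entries, with 4 total slots allocated.
  Extend size is 8 slots and the next iCode is 5.
  1. C:\Program Files\PI\dat\piarch.001
  2. C:\Program Files\PI\dat\piarch.002
  3. C:\Program Files\PI\dat\piarch.003
  4. C:\Program Files\PI\dat\piarchHist.001
  Alphabetical index:
  C:\Program Files\PI\dat\piarch.001
  C:\Program Files\PI\dat\piarch.002
  C:\Program Files\PI\dat\piarch.003
  C:\Program Files\PI\dat\piarchHist.001
End of Dump"""

for path in parse_archive_dump(SAMPLE):
    print(path)
```

In a real job the text would come from running the utility on the PI Server machine, for example via Python's `subprocess.run(..., capture_output=True, text=True)`, rather than from a pasted sample. The exact layout may differ between PI versions, so treat the regular expression as a starting point.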

Another common practice that I see is to have a separate folder for archive files. I do that when I perform installations for customers as well. In this case it would be possible to work with all available archive files in that folder.

• ###### Re: Get the Archive - Files

Han Yong

Are you looking to get the list of archive files programmatically using PI SDK (since this post is located in this forum)? This information is not exposed by PI SDK at this moment.

Yes, I am.

Han Yong

Another common practice that I see is to have a separate folder for archive files. I do that when I perform installations for customers as well. In this case it would be possible to work with all available archive files in that folder.

I don't really understand what you mean (the archive files are created automatically).

=> I am also trying to get a feature request raised: the reprocessing of the archives should be done automatically.

• ###### Re: Get the Archive - Files

I just mean that if we have a dedicated folder for archive files, instead of the default \pi\dat\ folder, it would be easy to find all the archive files and list them by other means.
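As a rough sketch of what "other means" could look like, the snippet below lists the files in a dedicated folder whose extension is purely numeric, matching names like piarch.001 from the dump earlier in the thread. The naming convention is an assumption (archives can be named differently on your system), and `list_archive_files` is a hypothetical helper, not part of any PI tool. Note that a folder listing shows files on disk, which is not necessarily the same as the set of archives currently registered with the server.

```python
import tempfile
from pathlib import Path


def list_archive_files(folder):
    """Return files in `folder` whose extension is purely numeric, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix[1:].isdigit()
    )


if __name__ == "__main__":
    # Demo against a throwaway folder; point this at your own archive folder instead.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("piarch.002", "piarch.001", "readme.txt"):
            (Path(demo_dir) / name).touch()
        for archive in list_archive_files(demo_dir):
            print(archive.name)
```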

• ###### Re: Get the Archive - Files

Wolfgang Purrer

PS: I also created a feature request. Jobs like this should be provided by OSIsoft (#: 296881)

Wolfgang Purrer

=> I am also trying to get a feature request raised: the reprocessing of the archives should be done automatically.

Wolfgang,

There is more than one kind of situation in which you would want or need to get your archives reprocessed:

1. get a healthy version of a corrupt archive
2. get a dynamic version of a full fixed archive (or a larger fixed archive), so that it can accommodate more events
3. merge archives
4. split archives

If the required enhancements were implemented in the PI System by OSIsoft, situations 1 and 2 could be detected automatically by the PI Server core subsystems, but it seems to me that situations 3 and 4 are more likely to be left to the system manager's discretion.

What exact situations are you thinking of? What exact kind of issues/requests did you raise on case #296881? I'm asking because I am not an OSIsoft employee, so I don't have access to your call history.

• ###### Re: Get the Archive - Files

Daniel Takara

get a dynamic version of a full fixed archive (or a larger fixed archive), so that it can accommodate more events

In the newer version of PI Server, this feature (auto-converting a full fixed archive to a dynamic archive) has been implemented.

Daniel Takara

@Wolfgang: As Daniel mentioned, not everyone in the community is able to view the call records, so it would be a good idea to share exactly what situation you are facing. I took a quick look at the call. It seems you mentioned that archives that have not been reprocessed are slow. What performance difference do you see before and after reprocessing?

• ###### Re: Get the Archive - Files

Han Yong

In the newer version of PI Server, this feature (auto-converting a full fixed archive to a dynamic archive) has been implemented.

That's correct:

Issue No.: 6426OSI8

Product: PI3

Module: Archive

Found in Version: 3.2.332.00

Fixed in Version: 3.4.375.30

Targeted in Version: (Fixed)

Severity: 1-High

Summary
Enhancements to fixed-size archive files when data is overflowing

Description
A new type of dynamic archive files (auto dynamic) has been introduced to mitigate the risks of data loss when fixed-size archives become full.
The Archive Subsystem will automatically convert fixed-size archives to dynamic when they become full. Once converted, these new archives can grow to a maximum of 2TB, unless the volume is running low on disk space. A message from "piarcmgr" will be sent to the PI Message Log to indicate a successful conversion. Here is an example of such a message:
Archive C:\pi\arc\arc.004 is 100% full (id: 15)
[-30405] Fixed-size archive was automatically converted to dynamic type

Note: this feature can be disabled by setting the parameter "Archive_EnableAutoDynamic" to a value of "0" (zero) in the pitimeout table (SMT Tuning Parameters).

• ###### Re: Get the Archive - Files

The reason is speed: after reprocessing, the data in the file is sorted by point rather than by time, so queries against the archives are a lot faster.

• ###### Re: Get the Archive - Files

I am pretty sure that is not the case.

Looking at the [System Manager I] course materials, the archives are organised as a set of 1k records, each record containing a point ID (or possibly record ID) and then a time-ordered set of events for that point. Every tag on the system will have at least one of these records. This is the primary record for the tag. If and when the primary record fills up, an 'overflow' record is created and the process continues (actually, the primary record is converted into a list of pointers to overflow records to speed up data retrieval).

So archives, whether processed or not, are 'sorted' by tag and, within each tag, by time.
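To make the layout described above concrete, here is a toy model in Python. It is only an illustration of the idea (one primary record per point, overflow records once the primary fills, events in time order within each point) and not the actual archive file format. The capacity of 4 events per record, the class names, and the choice to move the primary's events into the first overflow record are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

# Hypothetical capacity; real records are 1 KB, not a fixed number of events.
EVENTS_PER_RECORD = 4


@dataclass
class Record:
    point_id: int
    events: list = field(default_factory=list)  # (timestamp, value) in time order

    def is_full(self):
        return len(self.events) >= EVENTS_PER_RECORD


@dataclass
class PointStorage:
    point_id: int
    primary: Record = None
    overflow: list = field(default_factory=list)  # overflow records, oldest first

    def __post_init__(self):
        self.primary = Record(self.point_id)

    def append(self, timestamp, value):
        """Store one event, spilling into overflow records when needed."""
        if not self.overflow and not self.primary.is_full():
            self.primary.events.append((timestamp, value))
            return
        if not self.overflow:
            # Assumption: the full primary hands its events to the first overflow
            # record and from then on only acts as an index (self.overflow).
            self.overflow.append(Record(self.point_id, self.primary.events))
            self.primary.events = []
        if self.overflow[-1].is_full():
            self.overflow.append(Record(self.point_id))
        self.overflow[-1].events.append((timestamp, value))

    def all_events(self):
        """Return every event for this point in time order."""
        if not self.overflow:
            return list(self.primary.events)
        return [event for record in self.overflow for event in record.events]


storage = PointStorage(point_id=1)
for t in range(10):
    storage.append(t, t * 1.5)
print(len(storage.overflow), "overflow records,", len(storage.all_events()), "events")
```

Reading a point's history never has to look at other points' records, which is the sense in which the data is already grouped by tag and then by time.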

Can someone from OSI confirm this and also expand on what benefits reprocessing a healthy archive can bring?

Regards,

--- Alistair.

• ###### Re: Get the Archive - Files

Alistair Frith

So archives, whether processed or not, are 'sorted' by tag and, within each tag, by time.

Alistair is right about this.

What I know is that reprocessing would have the following effect:

• compacting the archive file (combining half-filled records, removing records of deleted tags)
• resolving any corrupted records in the archive file

For most cases, there would not be significant improvements just by reprocessing the archive file. Perhaps there is something different that Wolfgang is facing here.

@Wolfgang: I am interested to find out how much performance difference you are seeing though and how this is quantified. We can take a look at the archive's internal structure by doing an archive dump ("\pi\adm\pidiag -archk 'path to archive file' complete > archkoutput.txt"). We can do it before and after reprocessing to compare the difference.

I would recommend moving this discussion to another forum post, since it is no longer related to PI SDK programming.

• ###### Re: Get the Archive - Files

Although this thread is a bit stale, I want to clarify the following in case others see this later.

Reprocessing archives essentially defrags them.  This does not remove any data, and it has improved performance by up to 10x for some customers.  In other words, this would be a good approach for improving performance of large historical queries, though YMMV, as they say.

Note that we mention archive reprocessing in the PI System Tuning and Optimization webinar.

BTW, we are looking into ways to eliminate the need to reprocess archives for backfilling data, and also automate the reprocessing of archives for performance reasons in PI Server 2012...

• ###### Re: Get the Archive - Files

Jay Lakumb

BTW, we are looking into ways to eliminate the need to reprocess archives for backfilling data, and also automate the reprocessing of archives for performance reasons in PI Server 2012...

Will it be expanded to Collective management or still only concerned with the individual PI server instance regardless if it is part of a collective?  For example, reprocessing old archives on the Primary triggers the same management task on the other collective members?  Unfortunately we get out of sync archive files as it is so I guess I answered my own question...

What are the performance impacts (if any) of reprocessing an old archive that was created when the system had 50,000 tags, when the same system now has >= 500,000 tags?  If you're not going to backfill for the archive's timespan, won't there be 450,000 unused primary records added to that archive?  Multiply that by the number of years' worth of archives since the system grew: would there be an impact on historical data retrieval?
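A back-of-the-envelope estimate of the size side of that question, as a sketch only. It assumes the 1 KB record size mentioned earlier in the thread and takes the premise at face value that reprocessing adds one empty primary record for every tag that did not exist when the archive was created. Whether the server actually does that is exactly what is being asked, so this is not a statement about PI Server behaviour.

```python
# Assumption: "1k records", as described from the course materials earlier in the thread.
RECORD_SIZE_BYTES = 1024


def unused_primary_overhead(tags_then, tags_now, record_size=RECORD_SIZE_BYTES):
    """Return (number of added empty primary records, their total size in bytes)."""
    added = max(tags_now - tags_then, 0)
    return added, added * record_size


added, extra_bytes = unused_primary_overhead(50_000, 500_000)
print(f"{added} empty primary records, about {extra_bytes / 2**20:.0f} MiB per archive")
```

Under those assumptions that is 450,000 records of 1 KB each, so roughly 439 MiB of empty primary records per reprocessed archive, before any effect on retrieval speed is considered.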

• ###### Re: Get the Archive - Files

Something else... the PI Server should be more self-checking too, for example periodically performing integrity checks of archives (maybe after an archive shift) to check for records with high indices and report those, even with 'suggestions' for resizing archive files, compression settings, etc.  Obviously a PI Collective should have the option to compare the archive integrity of its members as a whole.

Any intentions to provide a GUI like the 'PI Diagnostics Client' that is able to perform maintenance or administration tasks without having to remember all the various offline utilities and command line parameters?  PI SMT only scratches the surface.  PI SMT with links to Powershell to invoke the utilities and report the output and present options for further processing..?

• ###### Re: Get the Archive - Files

One last thing, I promise.

Isn't this (my post above) an area where point partitioning would help, when within your archives you have a set of tags that are updated far more rapidly than all other tags?  In this situation the PI Server (or maybe a manual process?) would be able to move those rapidly updating tags to their own archive file whilst the remaining points stay in the 'normal' archive file.  The PI Server (or you) would keep more efficient archives.

• ###### Re: Get the Archive - Files

I would also like to be able to get a list of archives using PI-SDK or PI-OLEDB.  We keep some archives local for performance, but after a certain length of time we move them to a share with higher storage capacity.  We had a custom application written to do this for us; it takes a dump of the archives using piartool -al and parses it.

Since we have a lot of data and use collectives now I'm facing a scenario of having 3 copies of the same data.  This really isn't necessary since you can share archives.  I'd like to write a program similar to the one we have that moves the archives, but that also sets the archive to read only at the same time.  This way the same archive file can be used by multiple PI servers.  I think there are lots of practical reasons to make this list available via the SDK.

Matt

• ###### Re: Get the Archive - Files

It turns out this is a scenario which we tested with new PowerShell cmdlets in development.  We wrote a sample script to do this as proof of concept, and it works marvelously.  The script can also be run remotely.  We are planning to unveil the PowerShell cmdlets at vCampus Live! this year, and are going to try and make those available for download on vCampus soon.  Although this uses scripting instead of coding with PI SDK/PI OLEDB, hopefully you find it meets your needs.

• ###### Re: Get the Archive - Files

Jay Lakumb

It turns out this is a scenario which we tested with new PowerShell cmdlets in development.  We wrote a sample script to do this as proof of concept, and it works marvelously.  The script can also be run remotely.  We are planning to unveil the PowerShell cmdlets at vCampus Live! this year, and are going to try and make those available for download on vCampus soon.  Although this uses scripting instead of coding with PI SDK/PI OLEDB, hopefully you find it meets your needs.

Rhys +1'd this