What causes the Coordinating Node to show a different number of data and metadata objects than a Member Node?

asked 2013-07-17 18:00:16 -0500

Sometimes, when looking for Member Node (MN) content on the Coordinating Node (CN), there can be a discrepancy in the count of objects that persists even after synchronization has run. For example, a Member Node might be reporting that it contains 1017 FGDC metadata objects, but the CN might only list 965. Or, the CN might be reporting more objects than the MN. What are the major causes, and how does a Member Node operator avoid these discrepancies?

1 answer

answered 2013-07-17 18:27:59 -0500

updated 2013-07-17 19:16:59 -0500

These discrepancies occur either when synchronization is not functioning properly, so the MN and CN have not synced their content since the MN made changes, or when the MN makes changes without properly notifying the CN. There are two cases to consider:

Case 1: Fewer objects on CN than MN: check <dateSysMetadataModified>

In this case, the likely problem is that the CN is not discovering objects on the MN during synchronization. This can happen if the MN inserts some objects but fails to set the <dateSysMetadataModified> field to the current time. Some MNs back-date these system metadata modification times, not realizing that the CN relies on that date to determine which objects might have changed and need to be synced. If <dateSysMetadataModified> is set to a date before the most recent synchronization time, the CN will never notice that object and never harvest it. To fix this, modify the system metadata and set <dateSysMetadataModified> to the current time; the next time synchronization runs, the changes will be noticed and picked up.
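
As a rough illustration of the fix, the sketch below resets the modification time in a system metadata document using only the Python standard library. It makes several assumptions: the XML is a simplified, namespace-free stand-in for real system metadata, the function names `touch_sysmeta` and `is_visible_to_sync` are hypothetical, and the "modified later than last sync" comparison is only a simplified model of how the CN decides what to harvest. Submitting the updated document to the MN is not shown.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for a system metadata document (assumption).
SAMPLE = """<systemMetadata>
  <identifier>example.1.1</identifier>
  <dateUploaded>2013-07-01T00:00:00Z</dateUploaded>
  <dateSysMetadataModified>2010-01-01T00:00:00Z</dateSysMetadataModified>
</systemMetadata>"""

STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def touch_sysmeta(xml_text, now=None):
    """Return the document with dateSysMetadataModified set to `now` in UTC."""
    now = now or datetime.now(timezone.utc)
    root = ET.fromstring(xml_text)
    field = root.find("dateSysMetadataModified")
    if field is None:
        # Appended at the end; a schema-valid document may need a specific order.
        field = ET.SubElement(root, "dateSysMetadataModified")
    field.text = now.astimezone(timezone.utc).strftime(STAMP_FORMAT)
    return ET.tostring(root, encoding="unicode")


def is_visible_to_sync(xml_text, last_sync):
    """Simplified model: the object is noticed only if modified after last_sync."""
    root = ET.fromstring(xml_text)
    stamp = root.findtext("dateSysMetadataModified")
    modified = datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)
    return modified > last_sync
```

With a last synchronization time of 2013-07-17, the back-dated sample above would be skipped under this model, while the output of `touch_sysmeta` would be picked up on the next run.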

Case 2: More objects on CN than MN: ensure archived and obsoletes/obsoletedBy are set

In this case, some objects on the CN are not present on the MN. The only way for this to happen is if the MN has removed the content that was previously associated with a particular identifier (often in the process of trying to change an identifier). The solution is simple as well: remember that identifiers (PIDs) are both persistent and non-reusable, so if you want to retire an identifier from your system, first insert the content under the new identifier, being sure to reference the old identifier in the <obsoletes> system metadata field. Then remove the content for the old identifier, but keep its system metadata, reference the new PID in its <obsoletedBy> field, and mark the object as <archived>true</archived>. This tells the CN that the object identified by the old PID is no longer active on the MN and that it has been replaced by the new PID, which allows the CN to properly understand and index the changes made on the MN.
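
The field updates described above can be sketched as follows. This is a minimal illustration, not DataONE client code: the XML is a simplified, namespace-free stand-in for real system metadata, `retire_identifier` and `_set` are hypothetical helper names, and new elements are simply appended, so a schema-valid document may need them in a specific order. Refreshing <dateSysMetadataModified> on both documents is an added assumption, following the reasoning in Case 1.

```python
import xml.etree.ElementTree as ET


def _set(root, tag, text):
    """Set the text of child `tag`, creating the element if it is missing."""
    element = root.find(tag)
    if element is None:
        element = ET.SubElement(root, tag)
    element.text = text


def retire_identifier(old_xml, new_xml, modified):
    """Link old and new system metadata and mark the old object as archived.

    `modified` is the timestamp string written to dateSysMetadataModified
    on both documents so that the changes are not back-dated.
    """
    old = ET.fromstring(old_xml)
    new = ET.fromstring(new_xml)
    old_pid = old.findtext("identifier")
    new_pid = new.findtext("identifier")

    _set(new, "obsoletes", old_pid)    # new object points back at the old PID
    _set(old, "obsoletedBy", new_pid)  # old object points forward to the new PID
    _set(old, "archived", "true")      # old PID is no longer active on the MN

    for doc in (old, new):
        _set(doc, "dateSysMetadataModified", modified)

    return (ET.tostring(old, encoding="unicode"),
            ET.tostring(new, encoding="unicode"))
```

For example, calling it with documents whose identifiers are `example.1.1` and `example.2.1` returns an old document that carries <obsoletedBy>example.2.1</obsoletedBy> and <archived>true</archived>, and a new document that carries <obsoletes>example.1.1</obsoletes>.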

See the documentation on SystemMetadata for more details: http://mule1.dataone.org/ArchitectureDocs-current/apis/Types.html#Types.SystemMetadata

