Class ForcedMetadataConnector

  • All Implemented Interfaces:
    org.apache.manifoldcf.agents.interfaces.IPipelineConnector, org.apache.manifoldcf.agents.interfaces.ITransformationConnector, org.apache.manifoldcf.core.interfaces.IConnector

    public class ForcedMetadataConnector
    extends org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
    This connector works as a transformation connector, and merely adds specified metadata items.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String _rcsid  
      static java.lang.String ATTRIBUTE_PARAMETER  
      static java.lang.String ATTRIBUTE_SOURCE  
      static java.lang.String ATTRIBUTE_TARGET  
      static java.lang.String ATTRIBUTE_VALUE  
      static java.lang.String NODE_EXPRESSION  
      static java.lang.String NODE_FIELDMAP  
      static java.lang.String NODE_FILTEREMPTY  
      static java.lang.String NODE_KEEPMETADATA  
      static java.lang.String NODE_PAIR  
      • Fields inherited from class org.apache.manifoldcf.core.connector.BaseConnector

        currentContext, params
      • Fields inherited from interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector

        DOCUMENTSTATUS_ACCEPTED, DOCUMENTSTATUS_REJECTED
    • Method Summary

      Modifier and Type Method Description
      int addOrReplaceDocumentWithException​(java.lang.String documentURI, org.apache.manifoldcf.core.interfaces.VersionContext pipelineDescription, org.apache.manifoldcf.agents.interfaces.RepositoryDocument document, java.lang.String authorityNameString, org.apache.manifoldcf.agents.interfaces.IOutputAddActivity activities)
      Add (or replace) a document in the output data store using the connector.
      protected static boolean allDates​(IDataSource[] dataSources)  
      protected static boolean allReaders​(IDataSource[] dataSources)  
      protected static IDataSource append​(IDataSource currentValues, IDataSource data)  
      protected static java.lang.Object[] conditionallyRemoveNulls​(java.lang.Object[] input, boolean filterEmpty)  
      protected static void fillInExpressionsTab​(java.util.Map<java.lang.String,​java.lang.Object> paramMap, org.apache.manifoldcf.core.interfaces.Specification os)  
      java.lang.String getFormCheckJavascriptMethodName​(int connectionSequenceNumber)
      Obtain the name of the form check javascript method to call.
      java.lang.String getFormPresaveCheckJavascriptMethodName​(int connectionSequenceNumber)
      Obtain the name of the form presave check javascript method to call.
      org.apache.manifoldcf.core.interfaces.VersionContext getPipelineDescription​(org.apache.manifoldcf.core.interfaces.Specification spec)
      Get a pipeline version string, given a pipeline specification object.
      protected static void moveData​(org.apache.manifoldcf.agents.interfaces.RepositoryDocument docCopy, java.lang.String target, FieldDataFactory document, java.lang.String field, boolean filterEmpty)  
      protected static java.lang.String nonExpressionEscape​(java.lang.String input)
      Upgrades older constant values to new ones that won't trigger expression evaluation.
      protected static java.lang.String nonExpressionUnescape​(java.lang.String input)
      This is used to unescape text that's been escaped to prevent substitution of ${} expressions.
      void outputSpecificationBody​(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification os, int connectionSequenceNumber, int actualSequenceNumber, java.lang.String tabName)
      Output the specification body section.
      void outputSpecificationHeader​(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification os, int connectionSequenceNumber, java.util.List<java.lang.String> tabsArray)
      Output the specification header section.
      protected static int parseArgument​(java.lang.String input, int start, java.lang.StringBuilder output)  
      protected static int parseToEnd​(java.lang.String input, int start)  
      static IDataSource processExpression​(java.lang.String expression, FieldDataFactory sourceDocument)  
      java.lang.String processSpecificationPost​(org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification os, int connectionSequenceNumber)
      Process a specification post.
      protected static java.lang.String[] removeEmpties​(java.lang.String[] input)  
      void viewSpecification​(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification os, int connectionSequenceNumber)
      View specification.
      • Methods inherited from class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector

        checkDateIndexable, checkDocumentIndexable, checkLengthIndexable, checkMimeTypeIndexable, checkURLIndexable, getActivitiesList, requestInfo
      • Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector

        check, clearThreadContext, connect, deinstall, disconnect, getConfiguration, install, isConnected, outputConfigurationBody, outputConfigurationBody, outputConfigurationHeader, outputConfigurationHeader, outputConfigurationHeader, pack, packFixedList, packList, packList, poll, processConfigurationPost, processConfigurationPost, setThreadContext, unpack, unpackFixedList, unpackList, viewConfiguration, viewConfiguration
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface org.apache.manifoldcf.core.interfaces.IConnector

        check, clearThreadContext, connect, deinstall, disconnect, getConfiguration, install, isConnected, outputConfigurationBody, outputConfigurationHeader, poll, processConfigurationPost, setThreadContext, viewConfiguration
    • Constructor Detail

      • ForcedMetadataConnector

        public ForcedMetadataConnector()
    • Method Detail

      • getPipelineDescription

        public org.apache.manifoldcf.core.interfaces.VersionContext getPipelineDescription​(org.apache.manifoldcf.core.interfaces.Specification spec)
                                                                                    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                                                                           org.apache.manifoldcf.agents.interfaces.ServiceInterruption
        Get a pipeline version string, given a pipeline specification object. The version string is used to uniquely describe the pertinent details of the specification and the configuration, to allow the Connector Framework to determine whether a document will need to be processed again. Note that the contents of any document cannot be considered by this method; only configuration and specification information can be considered. This method presumes that the underlying connector object has been configured.
        Specified by:
        getPipelineDescription in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        getPipelineDescription in class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
        Parameters:
        spec - is the current pipeline specification object for this connection for the job that is doing the crawling.
        Returns:
        a string, of unlimited length, which uniquely describes configuration and specification in such a way that if two such strings are equal, nothing that affects how or whether the document is indexed will be different.
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        org.apache.manifoldcf.agents.interfaces.ServiceInterruption
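        The version-string contract above (equal strings must imply identical processing) can be illustrated with a minimal, self-contained sketch. This is not the connector's actual encoding: the class VersionStringSketch, its method names, and the choice of inputs (a map of forced metadata plus keep-metadata and filter-empty flags) are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the ManifoldCF implementation.
final class VersionStringSketch {

  // Length-prefix each value so that adjacent values cannot run together
  // ("ab" + "c" must not produce the same text as "a" + "bc").
  static void packValue(StringBuilder sb, String value) {
    sb.append(value.length()).append(':').append(value);
  }

  // Builds a deterministic string from specification-like settings only.
  // No document content is consulted, matching the contract described above.
  static String buildVersionString(Map<String, String> forcedMetadata,
                                   boolean keepAllMetadata,
                                   boolean filterEmpty) {
    StringBuilder sb = new StringBuilder();
    sb.append(keepAllMetadata ? '+' : '-');
    sb.append(filterEmpty ? '+' : '-');
    // Sort by key so that insertion order does not change the result.
    Map<String, String> sorted = new TreeMap<>(forcedMetadata);
    sb.append(sorted.size()).append(':');
    for (Map.Entry<String, String> e : sorted.entrySet()) {
      packValue(sb, e.getKey());
      packValue(sb, e.getValue());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<String, String> first = new LinkedHashMap<>();
    first.put("category", "finance");
    first.put("source", "intranet");

    Map<String, String> second = new LinkedHashMap<>();
    second.put("source", "intranet");
    second.put("category", "finance");

    // Same settings in a different order give the same version string.
    System.out.println(buildVersionString(first, true, false)
        .equals(buildVersionString(second, true, false)));
  }
}
```

        Sorting the keys and length-prefixing each value are design choices of the sketch: together they make the string independent of insertion order and unambiguous to compare.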
      • addOrReplaceDocumentWithException

        public int addOrReplaceDocumentWithException​(java.lang.String documentURI,
                                                     org.apache.manifoldcf.core.interfaces.VersionContext pipelineDescription,
                                                     org.apache.manifoldcf.agents.interfaces.RepositoryDocument document,
                                                     java.lang.String authorityNameString,
                                                     org.apache.manifoldcf.agents.interfaces.IOutputAddActivity activities)
                                              throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                                     org.apache.manifoldcf.agents.interfaces.ServiceInterruption,
                                                     java.io.IOException
        Add (or replace) a document in the output data store using the connector. This method presumes that the connector object has been configured and is thus able to communicate with the output data store, should that be necessary. The OutputSpecification is deliberately *not* provided to this method: any output must be consistent with the output description, because that description was part of what determined whether output should take place at all. This method may therefore need to decode an output description string in order to determine what should be done.
        Specified by:
        addOrReplaceDocumentWithException in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        addOrReplaceDocumentWithException in class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
        Parameters:
        documentURI - is the URI of the document. The URI is presumed to be the unique identifier which the output data store will use to process and serve the document. This URI is constructed by the repository connector which fetches the document, and is thus universal across all output connectors.
        pipelineDescription - is the description string that was constructed for this document by the getPipelineDescription() method.
        document - is the document data to be processed (handed to the output data store).
        authorityNameString - is the name of the authority responsible for authorizing any access tokens passed in with the repository document. May be null.
        activities - is the handle to an object that the implementer of a pipeline connector may use to perform operations, such as logging processing activity, or sending a modified document to the next stage in the pipeline.
        Returns:
        the document status (accepted or permanently rejected).
        Throws:
        java.io.IOException - only if there's a stream error reading the document data.
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        org.apache.manifoldcf.agents.interfaces.ServiceInterruption
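        As a rough illustration of what "adding specified metadata items" to a document can look like, the following sketch uses plain maps in place of RepositoryDocument. The class ForcedMetadataSketch, the transform method, and the rule that forced values are appended to any existing values of the same field are all assumptions of this sketch, not documented behavior of the connector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for RepositoryDocument fields.
final class ForcedMetadataSketch {

  // Returns a new field map; the input maps are never modified.
  static Map<String, List<String>> transform(Map<String, List<String>> sourceFields,
                                             Map<String, List<String>> forcedFields,
                                             boolean keepAllMetadata) {
    Map<String, List<String>> result = new LinkedHashMap<>();
    if (keepAllMetadata) {
      // Copy every existing field of the incoming document.
      for (Map.Entry<String, List<String>> e : sourceFields.entrySet()) {
        result.put(e.getKey(), new ArrayList<>(e.getValue()));
      }
    }
    // Add the forced values (assumption: appended, not replaced).
    for (Map.Entry<String, List<String>> e : forcedFields.entrySet()) {
      result.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, List<String>> source = new LinkedHashMap<>();
    source.put("title", Arrays.asList("Annual report"));

    Map<String, List<String>> forced = new LinkedHashMap<>();
    forced.put("category", Arrays.asList("finance"));

    System.out.println(transform(source, forced, true));
    System.out.println(transform(source, forced, false));
  }
}
```

        In the real connector the modified document would then be handed to the next pipeline stage through the activities object; the sketch simply returns the new field map.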
      • allDates

        protected static boolean allDates​(IDataSource[] dataSources)
                                   throws java.io.IOException,
                                          org.apache.manifoldcf.core.interfaces.ManifoldCFException
        Throws:
        java.io.IOException
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
      • allReaders

        protected static boolean allReaders​(IDataSource[] dataSources)
                                     throws java.io.IOException,
                                            org.apache.manifoldcf.core.interfaces.ManifoldCFException
        Throws:
        java.io.IOException
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
      • moveData

        protected static void moveData​(org.apache.manifoldcf.agents.interfaces.RepositoryDocument docCopy,
                                       java.lang.String target,
                                       FieldDataFactory document,
                                       java.lang.String field,
                                       boolean filterEmpty)
                                throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                       java.io.IOException
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        java.io.IOException
      • removeEmpties

        protected static java.lang.String[] removeEmpties​(java.lang.String[] input)
      • conditionallyRemoveNulls

        protected static java.lang.Object[] conditionallyRemoveNulls​(java.lang.Object[] input,
                                                                     boolean filterEmpty)
      • getFormCheckJavascriptMethodName

        public java.lang.String getFormCheckJavascriptMethodName​(int connectionSequenceNumber)
        Obtain the name of the form check javascript method to call.
        Specified by:
        getFormCheckJavascriptMethodName in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        getFormCheckJavascriptMethodName in class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
        Parameters:
        connectionSequenceNumber - is the unique number of this connection within the job.
        Returns:
        the name of the form check javascript method.
      • getFormPresaveCheckJavascriptMethodName

        public java.lang.String getFormPresaveCheckJavascriptMethodName​(int connectionSequenceNumber)
        Obtain the name of the form presave check javascript method to call.
        Specified by:
        getFormPresaveCheckJavascriptMethodName in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        getFormPresaveCheckJavascriptMethodName in class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
        Parameters:
        connectionSequenceNumber - is the unique number of this connection within the job.
        Returns:
        the name of the form presave check javascript method.
      • outputSpecificationHeader

        public void outputSpecificationHeader​(org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
                                              java.util.Locale locale,
                                              org.apache.manifoldcf.core.interfaces.Specification os,
                                              int connectionSequenceNumber,
                                              java.util.List<java.lang.String> tabsArray)
                                       throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                              java.io.IOException
        Output the specification header section. This method is called in the head section of a job page which has selected a pipeline connection of the current type. Its purpose is to add the required tabs to the list, and to output any javascript methods that might be needed by the job editing HTML.
        Specified by:
        outputSpecificationHeader in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        outputSpecificationHeader in class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
        Parameters:
        out - is the output to which any HTML should be sent.
        locale - is the preferred locale of the output.
        os - is the current pipeline specification for this connection.
        connectionSequenceNumber - is the unique number of this connection within the job.
        tabsArray - is the list of tab names. Add to this list any tab names that are specific to the connector.
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        java.io.IOException
      • outputSpecificationBody

        public void outputSpecificationBody​(org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
                                            java.util.Locale locale,
                                            org.apache.manifoldcf.core.interfaces.Specification os,
                                            int connectionSequenceNumber,
                                            int actualSequenceNumber,
                                            java.lang.String tabName)
                                     throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                            java.io.IOException
        Output the specification body section. This method is called in the body section of a job page which has selected a pipeline connection of the current type. Its purpose is to present the required form elements for editing. The coder can presume that the HTML that is output from this configuration will be within appropriate <html>, <body>, and <form> tags. The name of the form is "editjob".
        Specified by:
        outputSpecificationBody in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        outputSpecificationBody in class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
        Parameters:
        out - is the output to which any HTML should be sent.
        locale - is the preferred locale of the output.
        os - is the current pipeline specification for this job.
        connectionSequenceNumber - is the unique number of this connection within the job.
        actualSequenceNumber - is the connection within the job that has currently been selected.
        tabName - is the current tab name.
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        java.io.IOException
      • processSpecificationPost

        public java.lang.String processSpecificationPost​(org.apache.manifoldcf.core.interfaces.IPostParameters variableContext,
                                                         java.util.Locale locale,
                                                         org.apache.manifoldcf.core.interfaces.Specification os,
                                                         int connectionSequenceNumber)
                                                  throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
        Process a specification post. This method is called at the start of a job's edit or view page, whenever there is a possibility that form data for a connection has been posted. Its purpose is to gather form information and modify the transformation specification accordingly. The name of the posted form is "editjob".
        Specified by:
        processSpecificationPost in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        processSpecificationPost in class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
        Parameters:
        variableContext - contains the post data, including binary file-upload information.
        locale - is the preferred locale of the output.
        os - is the current pipeline specification for this job.
        connectionSequenceNumber - is the unique number of this connection within the job.
        Returns:
        null if all is well, or a string error message if there is an error that should prevent saving of the job (and cause a redirection to an error page).
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
      • viewSpecification

        public void viewSpecification​(org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
                                      java.util.Locale locale,
                                      org.apache.manifoldcf.core.interfaces.Specification os,
                                      int connectionSequenceNumber)
                               throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                      java.io.IOException
        View specification. This method is called in the body section of a job's view page. Its purpose is to present the pipeline specification information to the user. The coder can presume that the HTML that is output from this configuration will be within appropriate <html> and <body> tags.
        Specified by:
        viewSpecification in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        viewSpecification in class org.apache.manifoldcf.agents.transformation.BaseTransformationConnector
        Parameters:
        out - is the output to which any HTML should be sent.
        locale - is the preferred locale of the output.
        os - is the current pipeline specification for this job.
        connectionSequenceNumber - is the unique number of this connection within the job.
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        java.io.IOException
      • fillInExpressionsTab

        protected static void fillInExpressionsTab​(java.util.Map<java.lang.String,​java.lang.Object> paramMap,
                                                   org.apache.manifoldcf.core.interfaces.Specification os)
      • nonExpressionEscape

        protected static java.lang.String nonExpressionEscape​(java.lang.String input)
        Upgrades older constant values to new ones that won't trigger expression evaluation.
      • nonExpressionUnescape

        protected static java.lang.String nonExpressionUnescape​(java.lang.String input)
        This is used to unescape text that's been escaped to prevent substitution of ${} expressions.
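        The escape/unescape pair described for nonExpressionEscape and nonExpressionUnescape can be illustrated with a small round-trip sketch. The escaping scheme used here (a backslash placed before each '$' and each backslash) is an assumption chosen for illustration; this page does not state which scheme the connector actually uses, and the class EscapeSketch is hypothetical.

```java
// Hypothetical sketch; the real escaping scheme is not documented on this page.
final class EscapeSketch {

  // Prefix '$' and '\' with a backslash so that "${" can no longer be
  // read as the start of an expression.
  static String escape(String input) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (c == '\\' || c == '$') {
        sb.append('\\');
      }
      sb.append(c);
    }
    return sb.toString();
  }

  // Reverse of escape(): a backslash means "take the next character literally".
  static String unescape(String input) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (c == '\\' && i + 1 < input.length()) {
        i++;
        c = input.charAt(i);
      }
      sb.append(c);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String constant = "price in ${currency}";
    String escaped = escape(constant);
    System.out.println(escaped);
    // The round trip restores the original constant.
    System.out.println(unescape(escaped).equals(constant));
  }
}
```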
      • append

        protected static IDataSource append​(IDataSource currentValues,
                                            IDataSource data)
                                     throws java.io.IOException,
                                            org.apache.manifoldcf.core.interfaces.ManifoldCFException
        Throws:
        java.io.IOException
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
      • processExpression

        public static IDataSource processExpression​(java.lang.String expression,
                                                    FieldDataFactory sourceDocument)
                                             throws java.io.IOException,
                                                    org.apache.manifoldcf.core.interfaces.ManifoldCFException
        Throws:
        java.io.IOException
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
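        processExpression carries no description on this page. Going only by the mention of ${} substitution under nonExpressionUnescape, the general idea of expanding field references can be sketched as follows. The class ExpressionSketch, the expand method, the use of a plain map in place of FieldDataFactory, the String return type (the real method returns an IDataSource), and the rule that unknown fields expand to nothing are all assumptions; the real expression syntax may well support more than simple ${field} references.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ${field} substitution; not the connector's parser.
final class ExpressionSketch {

  // Matches ${name}, capturing the text between the braces.
  private static final Pattern FIELD_REF = Pattern.compile("\\$\\{([^}]*)\\}");

  // Replaces each ${name} with the value of that field.
  // Assumption: a field with no value contributes nothing to the output.
  static String expand(String expression, Map<String, String> fields) {
    Matcher m = FIELD_REF.matcher(expression);
    StringBuilder out = new StringBuilder();
    int last = 0;
    while (m.find()) {
      // Copy the literal text that precedes this reference.
      out.append(expression, last, m.start());
      String value = fields.get(m.group(1));
      if (value != null) {
        out.append(value);
      }
      last = m.end();
    }
    // Copy whatever literal text follows the final reference.
    out.append(expression, last, expression.length());
    return out.toString();
  }

  public static void main(String[] args) {
    Map<String, String> fields = new HashMap<>();
    fields.put("title", "Report");
    System.out.println(expand("Title: ${title}", fields));
  }
}
```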
      • parseArgument

        protected static int parseArgument​(java.lang.String input,
                                           int start,
                                           java.lang.StringBuilder output)
      • parseToEnd

        protected static int parseToEnd​(java.lang.String input,
                                        int start)