Judges

Judges score or classify candidates produced by a generator.

All judges inherit from CandidateJudgeBase and implement a judge() method returning CandidateJudgeOutput.

CandidateJudgeBase

locisimiles.pipeline.judge._base.CandidateJudgeBase

Abstract base class for candidate judges.

A judge receives the output of a candidate generator and produces a CandidateJudgeOutput — a dictionary mapping query-segment IDs to lists of CandidateJudge objects, each containing a source segment, the original candidate score, and a final judgment score.

Subclasses must implement judge().

Available implementations:

  • ClassificationJudge — scores pairs with a fine-tuned transformer classification model.
  • ThresholdJudge — applies a top-k or score-threshold rule.
  • IdentityJudge — passes candidates through unchanged (judgment_score = 1.0).

judge abstractmethod

judge(
    *,
    query: Document,
    candidates: CandidateGeneratorOutput,
    **kwargs: Any,
) -> CandidateJudgeOutput

Score or classify candidates.

PARAMETER DESCRIPTION
query

Query document (needed to look up query-segment texts).

TYPE: Document

candidates

Output from a candidate generator.

TYPE: CandidateGeneratorOutput

**kwargs

Judge-specific parameters.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
CandidateJudgeOutput

Mapping of query segment IDs → lists of CandidateJudge objects.
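
As a rough illustration of the contract described above, the sketch below defines a toy judge with the same keyword-only judge() shape. It deliberately does not import locisimiles: Segment, Candidate, JudgedCandidate, JudgeSketch, and LengthRatioJudge are hypothetical stand-ins, the field name candidate_score is an assumption, and the query is simplified to a dict of segment ID to text rather than a Document.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Segment:
    """Hypothetical stand-in for a text segment."""
    id: str
    text: str


@dataclass
class Candidate:
    """Hypothetical stand-in for one candidate-generator result."""
    segment: Segment
    score: float


@dataclass
class JudgedCandidate:
    """Hypothetical stand-in for CandidateJudge."""
    segment: Segment
    candidate_score: float  # assumed field name for the original score
    judgment_score: float


class JudgeSketch(ABC):
    """Mirrors the keyword-only judge() contract of the base class."""

    @abstractmethod
    def judge(self, *, query: Dict[str, str],
              candidates: Dict[str, List[Candidate]],
              **kwargs: Any) -> Dict[str, List[JudgedCandidate]]:
        """Score or classify candidates."""


class LengthRatioJudge(JudgeSketch):
    """Toy rule: judgment_score = shorter text length / longer text length."""

    def judge(self, *, query, candidates, **kwargs):
        output: Dict[str, List[JudgedCandidate]] = {}
        for qid, cands in candidates.items():
            q_len = len(query[qid])
            judged = []
            for cand in cands:
                c_len = len(cand.segment.text)
                longer = max(q_len, c_len)
                ratio = min(q_len, c_len) / longer if longer else 0.0
                # Keep the generator's score next to the new judgment score.
                judged.append(JudgedCandidate(cand.segment, cand.score, ratio))
            output[qid] = judged
        return output


query = {"q1": "arma virumque"}
candidates = {"q1": [Candidate(Segment("s1", "arma virumque cano"), 0.9)]}
results = LengthRatioJudge().judge(query=query, candidates=candidates)
```

The output keeps one list per query-segment ID, and every entry carries both the original candidate score and the judgment score, which is the structure the base class describes.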

ClassificationJudge

Judge candidates using a transformer sequence-classification model.

locisimiles.pipeline.judge.classification.ClassificationJudge

ClassificationJudge(
    *,
    classification_name: str = "julian-schelb/xlm-roberta-large-class-lat-intertext-v1",
    device: str | int | None = None,
    pos_class_idx: int = 1,
)

Judge candidates using a transformer classification model.

Loads a pre-trained sequence-classification model and tokenizer. For each query–candidate pair the model outputs P(positive), which is stored as judgment_score.

The default model is julian-schelb/xlm-roberta-large-class-lat-intertext-v1, a fine-tuned classifier for Latin intertextuality detection.

PARAMETER DESCRIPTION
classification_name

HuggingFace model identifier.

TYPE: str DEFAULT: 'julian-schelb/xlm-roberta-large-class-lat-intertext-v1'

device

Torch device, typically a string such as "cpu", "cuda", or "mps"; the signature also accepts an int or None.

TYPE: str | int | None DEFAULT: None

pos_class_idx

Index of the positive class in the classifier output.

TYPE: int DEFAULT: 1
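
To make the role of pos_class_idx concrete, here is a minimal sketch of how P(positive) could be read from classifier output. It assumes the model emits one logit per class and that probabilities come from a softmax, which is typical for sequence-classification heads but is not confirmed by this page; positive_probability is a hypothetical helper, not part of the library.

```python
import math
from typing import Sequence


def positive_probability(logits: Sequence[float], pos_class_idx: int = 1) -> float:
    """Apply a softmax to the logits and return the entry at pos_class_idx."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[pos_class_idx] / sum(exps)


# Two-class example: index 1 is treated as the positive class by default.
p_default = positive_probability([-1.0, 2.0])
# The same logits, but with class 0 declared positive.
p_flipped = positive_probability([-1.0, 2.0], pos_class_idx=0)
```

If a model was trained with the positive label at index 0, leaving pos_class_idx at its default would yield the probability of the negative class instead, so the two values above sum to 1.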

Example
from locisimiles.pipeline.judge import ClassificationJudge

# Create judge with default model
judge = ClassificationJudge(device="cpu")

# Score pre-generated candidates
results = judge.judge(query=query_doc, candidates=candidates)

# Each result has a judgment_score (probability of being a match)
for qid, judgments in results.items():
    for j in judgments:
        if j.judgment_score > 0.5:
            print(f"{qid} -> {j.segment.id}: {j.judgment_score:.3f}")

debug_input_sequence

debug_input_sequence(
    query_text: str, candidate_text: str, max_len: int = 512
) -> Dict[str, Any]

Inspect how a query–candidate pair is tokenised and encoded.

Useful for debugging classification results or understanding how text truncation affects model input.

PARAMETER DESCRIPTION
query_text

Raw query text.

TYPE: str

candidate_text

Raw candidate text.

TYPE: str

max_len

Maximum token length.

TYPE: int DEFAULT: 512

RETURNS DESCRIPTION
Dict[str, Any]

Dictionary with keys:

  • query / candidate: original texts.
  • query_truncated / candidate_truncated: texts after truncation.
  • input_ids: token ID list.
  • attention_mask: attention mask list.
  • input_text: decoded input with special tokens visible.

Example
judge = ClassificationJudge(device="cpu")
info = judge.debug_input_sequence(
    "Arma virumque cano",
    "Troiae qui primus ab oris",
)
print(info["input_text"])

judge

judge(
    *,
    query: Document,
    candidates: CandidateGeneratorOutput,
    batch_size: int = 32,
    **kwargs: Any,
) -> CandidateJudgeOutput

Classify each candidate pair using the loaded model.

PARAMETER DESCRIPTION
query

Query document.

TYPE: Document

candidates

Output from a candidate generator.

TYPE: CandidateGeneratorOutput

batch_size

Batch size for the classifier.

TYPE: int DEFAULT: 32

RETURNS DESCRIPTION
CandidateJudgeOutput

CandidateJudgeOutput with judgment_score = P(positive) from the classifier.
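
The effect of batch_size can be pictured with a small sketch of how query–candidate pairs might be grouped before being passed to the classifier. The helper iter_batches is hypothetical and only illustrates the chunking; the library's internal batching is not documented here.

```python
from typing import Iterator, List, Sequence, Tuple

Pair = Tuple[str, str]  # (query text, candidate text)


def iter_batches(pairs: Sequence[Pair], batch_size: int = 32) -> Iterator[List[Pair]]:
    """Yield consecutive slices of at most batch_size pairs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(pairs), batch_size):
        yield list(pairs[start:start + batch_size])


# 70 pairs with batch_size=32 give two full batches and one partial batch.
pairs = [(f"query {i}", f"candidate {i}") for i in range(70)]
batch_sizes = [len(batch) for batch in iter_batches(pairs, batch_size=32)]
```

A smaller batch_size generally lowers peak memory use at the cost of more forward passes, which is the usual trade-off when tuning this parameter.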

ThresholdJudge

Binary decisions based on candidate scores (top-k or threshold).

locisimiles.pipeline.judge.threshold.ThresholdJudge

ThresholdJudge(
    *,
    top_k: int = 10,
    similarity_threshold: Optional[float] = None,
)

Judge candidates using a simple score threshold or top-k cut-off.

Two strategies are available (mutually exclusive):

  • Top-k (default): the first top_k candidates per query (assumed to be sorted by score descending) receive judgment_score = 1.0; the rest get 0.0.
  • Similarity threshold: if similarity_threshold is provided, every candidate whose score >= similarity_threshold receives judgment_score = 1.0; all others receive 0.0.

PARAMETER DESCRIPTION
top_k

Number of top candidates to mark as positive.

TYPE: int DEFAULT: 10

similarity_threshold

Score threshold for positive decisions. If set, overrides top_k.

TYPE: Optional[float] DEFAULT: None
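
The two strategies can be sketched on a bare list of scores. This is a simplified illustration that assumes the scores are already sorted in descending order and works on floats rather than candidate objects; binary_judgments is a hypothetical helper, not the library's implementation.

```python
from typing import List, Optional


def binary_judgments(
    scores: List[float],
    top_k: int = 10,
    similarity_threshold: Optional[float] = None,
) -> List[float]:
    """Return 1.0 or 0.0 per score using the threshold rule or the top-k rule."""
    if similarity_threshold is not None:
        # A threshold, when given, takes precedence over top_k.
        return [1.0 if s >= similarity_threshold else 0.0 for s in scores]
    # Otherwise the first top_k positions are positive, the rest negative.
    return [1.0 if rank < top_k else 0.0 for rank in range(len(scores))]


scores = [0.92, 0.75, 0.70, 0.41]
by_rank = binary_judgments(scores, top_k=2)
by_threshold = binary_judgments(scores, similarity_threshold=0.7)
```

With these scores the top-k rule keeps the first two entries, while the threshold rule also keeps the third because the comparison is inclusive.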

Example
from locisimiles.pipeline.judge import ThresholdJudge

# Keep the 5 best candidates per query
judge = ThresholdJudge(top_k=5)
results = judge.judge(query=query_doc, candidates=candidates)

# Or use a similarity threshold instead
judge = ThresholdJudge(similarity_threshold=0.7)
results = judge.judge(query=query_doc, candidates=candidates)

judge

judge(
    *,
    query: Document,
    candidates: CandidateGeneratorOutput,
    top_k: Optional[int] = None,
    similarity_threshold: Optional[float] = None,
    **kwargs: Any,
) -> CandidateJudgeOutput

Apply threshold or top-k rule to produce binary judgments.

PARAMETER DESCRIPTION
query

Query document (unused but required by protocol).

TYPE: Document

candidates

Output from a candidate generator.

TYPE: CandidateGeneratorOutput

top_k

Override instance top_k.

TYPE: Optional[int] DEFAULT: None

similarity_threshold

Override instance similarity_threshold.

TYPE: Optional[float] DEFAULT: None

RETURNS DESCRIPTION
CandidateJudgeOutput

CandidateJudgeOutput with judgment_score ∈ {0.0, 1.0}.

IdentityJudge

Pass-through judge that marks every candidate as positive.

locisimiles.pipeline.judge.identity.IdentityJudge

Pass every candidate through with judgment_score = 1.0.

Useful when the candidate generator already performs all the filtering and scoring that is needed (e.g. the rule-based generator). No additional models are loaded.

Example
from locisimiles.pipeline.judge import IdentityJudge

judge = IdentityJudge()
results = judge.judge(query=query_doc, candidates=candidates)

# Every candidate gets judgment_score = 1.0
for qid, judgments in results.items():
    for j in judgments:
        print(j.judgment_score)  # 1.0

judge

judge(
    *,
    query: Document,
    candidates: CandidateGeneratorOutput,
    **kwargs: Any,
) -> CandidateJudgeOutput

Convert every Candidate to CandidateJudge with judgment_score = 1.0.
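
The conversion amounts to copying each candidate and attaching a constant score. The sketch below shows the idea with hypothetical stand-in dataclasses (Candidate, JudgedCandidate) and a free function identity_judge; the real classes and their field names may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """Hypothetical stand-in for one candidate-generator result."""
    segment_id: str
    score: float


@dataclass
class JudgedCandidate:
    """Hypothetical stand-in for CandidateJudge."""
    segment_id: str
    candidate_score: float  # assumed field name for the original score
    judgment_score: float


def identity_judge(
    candidates: Dict[str, List[Candidate]],
) -> Dict[str, List[JudgedCandidate]]:
    """Pass every candidate through unchanged with judgment_score = 1.0."""
    return {
        qid: [JudgedCandidate(c.segment_id, c.score, 1.0) for c in cands]
        for qid, cands in candidates.items()
    }


generated = {"q1": [Candidate("s1", 0.8), Candidate("s2", 0.3)], "q2": []}
judged = identity_judge(generated)
```

Order and original scores are preserved, so any filtering done by the generator carries over untouched.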