A 750-bp fragment of a novel human cysteine protease was identified in the dbEST databank. PCR cloning and DNA sequencing yielded a 1.38-kb full-length cDNA that encodes a polypeptide of 376 amino acids. The protein consists of a putative 21-residue signal peptide, a 106-residue propeptide and a 249-residue mature protein. The deduced amino acid sequence contains the highly conserved residues of the catalytic triad of papain-like cysteine proteases: cysteine, histidine, and asparagine. Furthermore, the protein sequence possesses two potential N-glycosylation sites: one in the propeptide and one in the mature protein. Comparison of the amino acid sequence of this protease, designated human cathepsin W, with other human thiol-dependent cathepsins revealed a relatively low degree of similarity (21-31%). In contrast to cathepsins L, S, K, B, H and O, cathepsin W contains a 21-amino-acid peptide insertion between the putative active site histidine and asparagine residues and an 8-amino-acid C-terminal extension. This unique sequence may indicate that cathepsin W belongs to a novel subgroup of papain-like proteases distinct from the cathepsin L-like and cathepsin B-like proteases. Northern blot analysis indicates that cathepsin W is expressed specifically in lymphatic tissues. Further analysis revealed predominant expression in T-lymphocytes, and more specifically in CD8+ cells. The expression of the protease in cytotoxic T-lymphocytes suggests a specific function in the mechanism or regulation of T-cell cytolytic activity.