Introduction

The search for genes that confer susceptibility to schizophrenia has yielded three waves of research, each of which has provided strong incremental support for the hypothesis that genetic factors are pivotally implicated in the etiology of this severe neuropsychiatric disorder. The first was an extensive series of family, twin and adoption studies.1, 2, 3 These clearly demonstrated a genetic component that might contribute up to 80% of the variance but could not clearly define the mode of inheritance.4 With the advent of powerful molecular techniques, a wave of linkage studies and whole genome scans followed. While results have not been fully consistent across scans, a confluence of findings for certain chromosomal regions is gradually emerging.5, 6, 7 Two meta-analyses8, 9 found significant support for linkage to certain chromosomal regions, although these differed between the two reports, possibly due to methodological disparities (Badner and Gershon8: 8p21–22, 13q14–32 and 22q11–13; Lewis et al9: 2q). The third wave in the search for schizophrenia susceptibility genes has focused on identifying specific genes, mostly in regions previously implicated by linkage studies. There have been several pivotal findings that have been independently replicated, most prominently DTNBP1, Neuregulin 1, G72 and DAAO, RGS410, 11 and DISC1.12, 13 While these genes have potential pathophysiological relevance to schizophrenia, it is noteworthy that pathogenic mutations or genetic variants that influence function by other mechanisms have not yet been identified. The possibility remains that other, as yet undiscovered genes that are in linkage disequilibrium with the identified loci are in fact implicated. Moreover, additional genes in other regions are likely to be implicated in the pathophysiology of the disorder.

The genome scan that was the starting point of the present study was performed by Lerer et al14 in a sample of Arab Israeli families multiply affected with schizophrenia. We applied multipoint nonparametric and parametric linkage analyses to the genotypes of 155 individuals from 21 families at 347 microsatellite markers distributed over the 22 autosomes. The strongest linkage was at chromosome 6q23 under a broad diagnostic category. The NPL at marker D6S292 (located 136.97 cM from the pter) was 4.60 (P=0.000004). The multipoint parametric LOD score under a dominant model and homogeneity was 3.33. Under the core diagnostic category the NPL was 4.29 (P=0.00001) and the LOD score 4.16 (dominant model). The region defined by a decrease of 1.0 in the NPL (broad diagnostic model) extended over 12 cM from 131.07 cM (between D6S1715 and D6S292) to 143.69 cM, adjacent to D6S311. The linkage observed in the 6q23 region fulfilled the criteria of Lander and Kruglyak15 for genome-wide significance. There is debate as to whether models that are not independent require correction for multiple testing. In the case of our study, however, the findings remained significant even after conservative Bonferroni correction for the number of diagnostic models.

In order to refine the linkage interval on chromosome 6q, we typed 42 additional microsatellite markers on the long arm of chromosome 6 between D6S1570 (99.01 cM from the pter) and D6S281 (190.14 cM from the pter) in the same sample that was used for the genome scan. We report the results of multipoint parametric and nonparametric analyses and of single point linkage analyses, which demonstrate increased support for a consistently localized linkage peak and a substantially reduced region likely to contain a susceptibility gene.

Materials and methods

Family ascertainment and diagnostic methods

As described previously in greater detail, the family sample for this study was systematically recruited from the catchment area of the Taibe Regional Mental Health Center in the central region of Israel.14 The project was approved by the Helsinki Committee (Internal Review Board) of the Hadassah-Hebrew University Medical Center and written informed consent was obtained from all subjects. All available and potentially informative family members were interviewed with the Schedule for Affective Disorders and Schizophrenia-Lifetime Version (SADS-L)16 and were questioned about psychiatric symptoms in the family according to the Family History Research Diagnostic Criteria (FH-RDC).17 Lifetime diagnoses were established according to the Research Diagnostic Criteria (RDC)18 and the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV)19 using a best estimate consensus procedure20 applied to all available sources of information. Diagnostic evaluations were completed without knowledge of the genotyping data. Three diagnostic categories were established: (i) schizophrenia (n=52) and schizoaffective disorder, depressed (n=7) according to RDC (Narrow); (ii) narrow diagnoses plus schizoaffective disorder, manic (n=3) or depressed and manic (RDC) (n=4) and unspecified functional psychosis (RDC) (n=2) (Core); (iii) core diagnoses plus brief psychotic episode according to DSM-IV (n=1), schizophrenia missing one criterion according to RDC (probable schizophrenia) (n=1), psychotic disorder not otherwise specified according to DSM-IV (n=1), schizotypal features according to RDC (n=2) and also affective psychotic diagnoses according to RDC (ie bipolar disorder, n=1, and major depression with psychotic features, n=1) (Broad). The sample included 21 families (155 individuals with DNA; average number of affected subjects 2.8 per family for the narrow diagnostic category, 3.2 for the core category and 3.6 for the broad category). In all, 12 families were nuclear, having one sibship with two or more affected members, while nine were extended, with two or more sibships containing affected members.

Genotyping

As for the original genome scan, genotyping of the additional microsatellite markers on chromosome 6q was performed at the Center for Genomic Technologies of the Hebrew University, Jerusalem, using markers chosen from the MDC-Genethon microsatellite maps (http://www.genlink.wustl.edu/genethon_frame/). In all, 42 markers were chosen between D6S1570 (99.01 cM from the pter) and D6S281 (190.14 cM from the pter). The average intermarker distance was 1.7 cM for the entire region. The coverage extended well beyond our putative linkage region defined by NPL-1 so as to include regions implicated by other linkage studies of schizophrenia on chromosome 6q, at a greater level of resolution than in the original genome scan. In the 23 cM region between D6S1715 and D6S311 (NPL-2 in the genome scan), markers were more closely spaced, with an average intermarker distance of 1.1 cM.

Individual markers (fluorescence-labeled primers from Applied Biosystems, Foster City, CA, USA) were amplified in a PTC 225 DNA Engine (MJ) using 25 ng of DNA, 6 pmoles (0.6 μM) of each primer, 1.5 mM MgCl2, 0.15 mM dNTPs, 1 × PCR Gold buffer (15 mM Tris-HCl, pH 8.0, 50 mM KCl) and 0.4 U of AmpliTaq Gold DNA polymerase (both from Applied Biosystems) in a total volume of 10 μl. Cycling conditions were an initial 12 min denaturation at 95°C; 10 cycles of 15 s at 94°C, 15 s at 55°C and 30 s at 72°C; 25 cycles of 15 s at 89°C, 15 s at 55°C and 30 s at 72°C; a final 10 min at 72°C; and a hold at 12°C. After amplification, labeled PCR products were pooled, up to 12 markers together, and 1–2 μl were sampled into 9 μl of loading buffer (formamide with GENESCAN 400HD [ROX] size standard; Applied Biosystems). PCR product electrophoresis and detection were performed using a 3700 Automated DNA Analyzer (Applied Biosystems, Foster City, CA, USA). Sizing and genotyping were performed using GENESCAN and GENOTYPER software (Applied Biosystems).

Checking for genotyping errors and for deviation from Mendelian expectations employed the program PedManager (made available by Mark Daly, Massachusetts Institute of Technology). Inconsistencies were resolved by retyping the markers for the specific subjects; if not resolvable, the genotype at the particular locus for these subjects was not considered. The overall laboratory error rate was very low21 (0.6%, when considering the nonresolvable errors only and 1.1% when including resolvable inconsistencies).
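The kind of Mendelian consistency check described above can be illustrated with a minimal sketch. This is not the PedManager implementation; the function name, the trio-based logic and the allele sizes are assumptions introduced here purely for illustration, and a real check would also need to handle missing genotypes and larger pedigree structures.

```python
from itertools import product

def mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed by taking
    one allele from the father and one allele from the mother.

    Each genotype is a pair of allele labels (e.g. fragment sizes in bp).
    """
    child_sorted = sorted(child)
    return any(
        sorted((paternal, maternal)) == child_sorted
        for paternal, maternal in product(father, mother)
    )

# Hypothetical microsatellite allele sizes (bp) for two parent-child trios.
trio_ok = mendelian_consistent((180, 184), (182, 186), (184, 182))
trio_bad = mendelian_consistent((180, 184), (182, 186), (188, 182))

print(trio_ok)   # consistent trio
print(trio_bad)  # allele 188 is absent from both parents
```

A genotype flagged as inconsistent in this way would be a candidate for retyping, as described in the text.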

Linkage analysis

Relative map positions of the markers were determined according to the distances reported in the Marshfield genetic maps (http://research.marshfieldclinic.org/genetics/Map_Markers/mapmaker/MapFormFrames.html). Both the newly typed markers and those from the original genome scan14 were included in the analysis. The information content of the markers is given in Table 1. The average polymorphism information content of the dense map used in the current study was 0.93 compared to 0.92 for the genome-scan markers on chromosome 6q. Multipoint nonparametric and parametric (dominant and recessive models) analyses were performed with the Allegro22 program, using the Spairs option and applying an affected-only method, in which all unaffected individuals are treated as phenotype unknown. Allelic frequencies for the microsatellites were estimated from a panel of unrelated, unaffected subjects randomly chosen from the same population as the pedigrees (n=26) and also from the married-in, unaffected subjects of our families; the frequencies did not differ from each other. In addition, in order to guard against possible bias, we reran the multipoint parametric analysis and a subset of the single point analyses using allele frequencies derived from the sample. As in the genome scan analysis,14 the parameters for the dominant model were: q (frequency of the affected allele)=1%, penetrance fAa=faa=90% and phenocopy rate=0.01% (fAA). The parameters for the recessive model were: q=10%, penetrance faa=90% and phenocopy rate=0.01% (fAA=fAa). The same parameters were applied for the single point analyses, for which Mlink from the Fastlink 4.1 package was employed.
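As a rough illustration of what these model parameters imply, the population prevalence of the trait can be computed from the allele frequency and the three penetrances. The sketch below assumes Hardy-Weinberg genotype proportions, which is an assumption made here for illustration only (it is not stated in the text and is unlikely to hold exactly in a consanguineous population); the function name is likewise hypothetical.

```python
def implied_prevalence(q, f_AA, f_Aa, f_aa):
    """Population prevalence implied by a single-locus, two-allele model.

    q     -- frequency of the susceptibility allele 'a'
    f_AA  -- penetrance of the non-carrier genotype (phenocopy rate)
    f_Aa  -- penetrance of the heterozygote
    f_aa  -- penetrance of the susceptibility-allele homozygote
    Assumes Hardy-Weinberg proportions (illustrative assumption).
    """
    p = 1.0 - q
    return p * p * f_AA + 2.0 * p * q * f_Aa + q * q * f_aa

# Dominant model from the text: q=1%, fAa=faa=90%, phenocopy rate fAA=0.01%.
dominant = implied_prevalence(q=0.01, f_AA=0.0001, f_Aa=0.90, f_aa=0.90)

# Recessive model from the text: q=10%, faa=90%, phenocopy rate fAA=fAa=0.01%.
recessive = implied_prevalence(q=0.10, f_AA=0.0001, f_Aa=0.0001, f_aa=0.90)

print(f"dominant model:  {dominant:.4%}")
print(f"recessive model: {recessive:.4%}")
```

Under these assumptions the dominant model implies a prevalence of about 1.8% and the recessive model about 0.9%.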

Table 1 Multipoint parametric analysis under two genetic and three diagnostic models

Calculation of significance levels

Taking into account the three diagnostic categories, there were three models for the nonparametric linkage analysis; with the two inheritance models, there were a total of six models for the parametric linkage analysis. As stated above, the issue of multiple testing incurred by the use of several inheritance and phenotypic models is not fully resolved. Clearly, these different models are dependent on each other, and a simple Bonferroni correction would be too conservative. Nevertheless, even after applying the Bonferroni correction, the P-values of our significant NPL scores (see below) remained below the threshold suggested as significant by Lander and Kruglyak15 for genome scan allele sharing analysis (P-value below 4.2 × 10−4). Furthermore, simulation studies in our previous analysis14 have shown that the chance rate of NPL above 4 in our study design was 0%. The present study is aimed at narrowing down the linked region identified previously, and thus is not a genome scan. The increased LOD and NPL scores we have obtained and the narrower region of linkage (see below) are consistent with our previous results and with their interpretation as a true signal. Thus, we chose not to correct our LOD and NPL scores, and we interpret their significance according to Lander and Kruglyak,15 that is, a parametric LOD score above 3.3 and an NPL with a P-value below 4.2 × 10−4.
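The Bonferroni argument for the nonparametric analysis can be checked with simple arithmetic. The sketch below multiplies each peak NPL P-value reported in the Results by the number of diagnostic models and compares the product with the threshold quoted above; the variable and function names are introduced here for illustration only.

```python
THRESHOLD = 4.2e-4   # significance threshold quoted in the text
N_MODELS = 3         # three diagnostic categories for the NPL analysis

# Peak NPL P-values reported in the Results, by diagnostic category.
npl_p_values = {"broad": 5.8e-7, "core": 2.6e-6, "narrow": 3.6e-5}

def bonferroni(p, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p * n_tests)

corrected = {model: bonferroni(p, N_MODELS) for model, p in npl_p_values.items()}
still_significant = {model: p < THRESHOLD for model, p in corrected.items()}

for model, p in corrected.items():
    print(f"{model}: corrected P = {p:.2e}, below threshold: {still_significant[model]}")
```

With these inputs all three corrected P-values remain below the threshold, in line with the statement in the text.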

Family-based association test

Family-based association analysis employed the program FBAT,23, 24 applying a dominant, multiallelic model and employing the empirical-variance estimator to account for correlations among affected siblings due to linkage.

Results

The results of the multipoint, nonparametric analysis are shown in Figure 1. The peak NPL was 4.98 (P=0.00000058) at D6S1626 (136.97 cM), immediately adjacent to D6S292 (NPL 4.98, P=0.00000068), the marker that gave the highest NPL in the original genome scan, under the broad diagnostic category. The putative susceptibility region, as defined by a decrement of 1.0 in the NPL (NPL-1), was 4.96 cM extending from midway between D6S1722 (133.28 cM) and D6S976 (135.47 cM) to midway between D6S1009 (137.74 cM) and D6S1569 (141.15 cM). The NPL-2 interval is 21.0 cM, extending from D6S407 (125.71 cM) to midway between D6S1649 (146.06 cM) and D6S311 (147.13 cM). Under the core diagnostic category the maximum NPL was 4.67 (P=0.0000026) and under the narrow diagnostic category 3.88 (P=0.000036), both at D6S1626.

Figure 1

Multipoint, nonparametric linkage analysis of microsatellite markers on chromosome 6q under broad, core and narrow diagnostic models. The maximum NPL was 4.98 (P=0.00000058) at 136.97 cM (marker D6S1626) under the broad diagnostic model, with peaks of 4.67 (P=0.0000026) and 3.88 (P=0.000036) at the same location under the core and narrow diagnostic models, respectively.

Table 1 shows the results of the multipoint parametric analysis under the three diagnostic and two genetic models in the 20 cM region between D6S1715 (125.71 cM) and D6S1649 (146.06 cM). The maximum multipoint LOD score was 4.63 under a dominant model, the core diagnostic category and homogeneity, at the adjacent markers D6S1626 and D6S292 (136.97 cM). The LOD-1 interval was 2.13 cM. Under the broad diagnostic category, the maximum LOD score was 3.82, at the same markers, also under a dominant model and homogeneity. The results did not differ appreciably when we ran the multipoint parametric analysis using allele frequencies derived from the linkage sample instead of from controls and unaffected married-in subjects. For the two best markers, the LOD scores under the broad diagnostic category were as follows. For D6S1626, the LOD score was 3.82 with control allele frequencies and 3.81 with allele frequencies derived from the sample. For D6S292, the respective LOD scores were 3.82 and 3.80.

Table 2 shows the results of the single point analyses under the three diagnostic and two genetic models. The maximum single point LOD score was 3.55 (θ=0.01) at D6S1626 (136.97 cM) under the broad diagnostic category. As indicated in Table 2, single point LOD scores exceeding 2.0 were observed at markers 4–6 cM centromeric and telomeric to the marker showing the maximum LOD score. Examining a subset of the markers that showed the highest LOD scores using allele frequencies derived from the sample yielded results highly similar to those obtained using allele frequencies derived from control and unaffected married-in subjects.
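For reference, the single point LOD scores reported here follow the standard two-point definition, sketched below. The symbols Z and L are notation introduced for this illustration: L(θ) denotes the likelihood of the pedigree data at recombination fraction θ between the marker and the putative disease locus under the specified genetic model.

```latex
Z(\theta) = \log_{10} \frac{L(\theta)}{L(\theta = 1/2)}, \qquad 0 \le \theta \le \tfrac{1}{2}
```

Read this way, a LOD score of 3.55 at θ=0.01 means that the data are roughly 3,500 times (10^3.55) more likely under linkage at that recombination fraction than under free recombination.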

Table 2 Single point parametric analysis under two genetic and three diagnostic models
Table 3 Selected brain-expressed genes with potential pathophysiological relevance to schizophrenia on chromosome 6q in the 4.96 cM NPL-1 region delineated by fine mapping of an Arab-Israeli family sample (further details and references available from the authors)

As a result of the genetic isolation and high levels of consanguinity of the sample and the relatively dense marker map, the degree of LD is likely to be high across large stretches of DNA. Therefore, we performed a family-based association analysis of the 58 markers using the program FBAT,23, 24 applying a dominant, multiallelic model and employing the empirical-variance estimator to account for correlations among affected siblings due to linkage. The results did not show evidence for biased transmission at any of the markers (P>0.1 in all cases).

Discussion

In the search for schizophrenia susceptibility genes, an encouraging pattern is beginning to emerge whereby replicated associations with positional candidates are being reported in chromosomal regions to which loci were previously mapped by linkage analysis.25, 26, 27, 28, 29, 30 An important step in such an endeavor is to strengthen the support for linkage and to narrow the size of the putative susceptibility region. The present study significantly strengthens the evidence for a schizophrenia susceptibility locus at chromosome 6q23. With a greatly increased marker density the position of the linkage peak remained unchanged at 136.97 cM in nonparametric and parametric multipoint analyses and single point analysis. The strength of the linkage increased from an NPL of 4.60 to 4.98 (broad diagnostic category) and in the multipoint parametric analysis from a LOD score of 4.16 to 4.63 (core diagnostic category, dominant model). The size of the linked region demarcated by NPL-1 decreased from 12.0 to 4.96 cM. The LOD-1 interval defined by the multipoint parametric analysis was even smaller covering only 2.1 cM.

A region of this size is potentially amenable to linkage disequilibrium mapping directed at identifying putative susceptibility genes associated with the disorder. Such an approach has a reasonable likelihood of success given the origin and nature of our sample. The Arab Israeli population is an ethnically homogeneous group that has a high birthrate, an unusually high level of consanguinity and a low rate of intermarriage with other population groups in Israel.31, 32 The sample that we have studied was recruited from three Arab Israeli towns that were founded approximately 200–250 years ago by a limited number of families.33 In subsequent years there was immigration into the towns but the major population increase has been due to a high birthrate and low infant mortality in the past 75 years. Traditionally, marriages are within the community, often within the same extended patrilineal clan.34 In our sample, out of a total of 88 marriages, 26 (29.5%) were between first cousins. This level of consanguinity would suggest an increased liability to recessive traits, although it is noteworthy that our best scores come from the dominant model; a striking increase in congenital malformations has indeed been reported in these towns.34 The uniqueness of the sample does not reside only in the high level of consanguinity but in its homogeneity and recent origin. Thus, the degree of linkage disequilibrium (LD) is likely to be high across large stretches of DNA, allowing the genes involved in a complex trait like schizophrenia to be efficiently mapped.

Rather than focusing only on the region abutting the linkage peak in our genome scan, we typed additional microsatellite markers over most of the long arm of chromosome 6. The reason for this decision was that linkage findings in schizophrenia on chromosome 6q have been spread over a rather large region14 and we thought it prudent to exclude the possibility that additional linkage peaks might have been missed in our genome scan. This was not the case: the linkage peak at chromosome 6q23 emerged as well demarcated, without evidence for additional peaks in the region between D6S1570 (99.01 cM from the pter) and D6S281 (190.14 cM from the pter).

Our findings should be considered against the background of relevant previous studies. The first report of a possible schizophrenia locus on chromosome 6q came from Cao et al35 who found excess allele sharing for markers in the 6q13–26 region, which was supported by a follow-up study from the same group36 and by a multicenter, collaborative study.37 Strong evidence for linkage to schizophrenia was observed more telomerically at 6q25 in a very large Swedish pedigree.38 Nominal support for linkage of schizophrenia on chromosome 6q was provided by a genome scan of five Austrian families39 and by a study of 30 African-American pedigrees.40 (These findings are reviewed by Lerer et al14). Besides reports of linkage of schizophrenia to chromosome 6q, there have recently been several intriguing reports of linkage to bipolar disorder on chromosome 6q. Most striking is the report of Middleton et al41 who found significant evidence for linkage of bipolar disorder at 125.8 Mb with an NPL of 4.20 and maximum LOD score of 3.56. The findings derived from a high-density SNP genotyping assay in 25 Portuguese families. More distant from our locus, McInnis et al42 found, in the context of a genome scan, an NPL of 2.5 (P=0.008) at D6S311 (147.5 cM) in 153 bipolar families from the NIMH Genetics Initiative, under a broad diagnostic model. In a genome scan of 250 pedigrees (collected under the NIMH Genetics Initiative but independent of previously reported bipolar pedigrees), Dick et al43 found a maximum LOD score of 2.20, at 114 cM, near the marker D6S1021, also under a broad diagnostic model. It is of interest that the highest NPL obtained in our sample was under the broad diagnostic model that we applied (although there were actually only two individuals with affective diagnoses included). Also, Park et al44 recently reported supportive evidence for linkage of psychosis in bipolar pedigrees to 6q21. The possibility that a single gene on chromosome 6q may predispose to both schizophrenia and bipolar disorder must be considered along with the hypothesis that more than one gene for major psychiatric disorder is located in this genomic region.

The 4.96 cM region delimited by NPL-1 in our sample contains, based on our search of the RefSeq database, 45 known genes, of which 32 have been shown to be expressed in the brain. We narrowed down the candidate genes on the basis of their putative function and potential relevance to schizophrenia, as cross-referenced by means of a PubMed search. In all, 13 candidate genes were identified, as shown in Table 3. While the highest priority is to narrow down the linked region still further, a gene-based, prioritized, systematic study of these candidate genes, or preferably a subset, in relation to schizophrenia could be indicated.45

Recently, Duan et al46 reported a possible susceptibility gene for schizophrenia in the chromosome 6q23 region. They genotyped 192 pedigrees with schizophrenia of European and African-American (AA) descent from samples that previously showed linkage to 6q13–q26, focusing on the MOXD1-STX7-TRARs gene cluster at 6q23.2. Significant association, after correction for multiple testing, was observed with a single SNP, rs4305745, located 1214 bp downstream from the stop codon of the trace amine receptor 4 (TRAR4) gene at 6q23.2. Two additional SNPs 3′ to rs4305745 and in perfect linkage disequilibrium with it showed association with schizophrenia. A potential role of trace amine receptors in schizophrenia is plausible given the long-considered role of biogenic amines in this disorder and its drug therapy and the fact that these receptors are activated by endogenous amines with amphetamine-like activity.47, 48 Pending independent replication, the finding of Duan et al46 is potentially consistent with our refined linkage findings in that the TRAR4 gene is located 4 Mb proximal to our linkage peak, although it is outside our NPL-1 interval. This raises the intriguing possibilities that either the same susceptibility gene segregates in these two populations or at least two distinct susceptibility alleles map to this chromosomal region. Further studies are required to resolve this question. Overall, the confluence of linkage findings in schizophrenia and bipolar disorder and the report of a putative susceptibility gene for schizophrenia strongly motivate further intensive efforts to identify one or more susceptibility genes for major psychiatric illness in the chromosome 6q region.